Updates to README.md. This is a work in progress: updates to README.md are still ongoing.

Michael Luciuk 2025-07-04 17:06:08 -04:00
parent b91b025062
commit 77179d38f3

README.md

@ -1,27 +1,172 @@
# ModrecWorkflow Demo
This project automates the process of generating data, training, and deploying a modulation recognition model for radio signal classification. The workflow is intended to support experimentation, reproducibility, and deployment of machine learning models that classify wireless signal modulations such as QPSK, 16-QAM, and BPSK.
# Modulation Recognition Demo
## Getting Started
RIA Hub Workflows is an automation platform built into RIA Hub. This project contains an example machine learning
workflow for the problem of signal modulation classification. It also serves as an excellent introduction to
RIA Hub Workflows.
1. Clone the Repository
## 📡 The machine learning development workflow
The development of intelligent radio solutions involves multiple steps:
1. First, we need to prepare a machine learning-ready dataset. This involves signal synthesis or capture, followed by
dataset curation to extract and qualify training examples. Finally, we need to perform any required data preprocessing
—such as augmentation—and split the dataset into training and test sets.
2. Secondly, we need to design and train a model. This is often an iterative process and can leverage techniques like
Neural Architecture Search (NAS) and hyperparameter optimization to automate finding a suitable model structure and
optimal hyperparameter configuration, respectively.
3. Once a machine learning model has been trained and validated, the next step is to build an inference application.
This step transforms the model from a research artifact into a practical tool capable of making predictions in
real-world conditions. Building an inference application typically involves several substeps including model
optimization, packaging and integration, and monitoring and logging.
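
As a small illustration of the last part of step 1, the sketch below splits a set of example indices into training and test subsets. It is not taken from this project's code: the function name `split_indices`, the 0.8 training fraction, and the fixed seed are all assumptions chosen for the example.

```python
import random


def split_indices(num_examples, train_fraction=0.8, seed=0):
    """Shuffle example indices and split them into train and test lists.

    Hypothetical helper for illustration; the project's own split logic
    may differ.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    indices = list(range(num_examples))
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(indices)
    cut = int(num_examples * train_fraction)
    return indices[:cut], indices[cut:]


train_idx, test_idx = split_indices(10, train_fraction=0.8)
print(len(train_idx), len(test_idx))
```

Shuffling before the cut avoids a split that simply follows the order in which examples were generated, and seeding the shuffle helps keep experiments reproducible.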
This is a lot of work, and much of it involves tedious software development and repetitive tasks like setting up and
configuring infrastructure. What's more, there is a shortage of domain expertise in ML and MLOps for radio. That's
where we come in. RIA Hub offers a no- and low-code solution for the end-to-end development of intelligent radio
systems, allowing for a sharper focus on innovation.
## ▶️ RIA Hub Workflows
One of the core principles of RIA Hub is Workflows, which allow users to run jobs in isolated Docker containers.
You can create workflows in one of two ways:
- Writing YAML and placing it in the special `.riahub/workflows/` directory in your repository.
- Using RIA Hub's built-in tools for Dataset Management, Model Building, and Application Development, which will
automatically generate the YAML workflow file(s) for you.
Workflows can be configured to run automatically on push and pull request events. You can monitor and manage running
workflows in the 'Workflows' tab in your repository.
## ⚙️ Qoherent-hosted runners
Qoherent-hosted runners are job containers that Qoherent provides and manages to run your workflows and jobs in RIA Hub
Workflows.
Why use Qoherent-hosted runners?
- Easy to set up and start running workflows quickly, without the need to set up your own infrastructure.
- Qoherent maintains runners equipped with access to common hardware and tools for radio ML development, including
SDR testbeds and common embedded targets.
If you want to learn more about the runners we have available, please feel free to reach out. We can also provide
custom runners equipped with specific radio hardware and RAN software upon request.
Want to register your own runner? No problem! Please refer to the RIA Hub documentation for more details.
## 🔍 Modulation Recognition
## 🚀 Getting started
1. Fork the project repo using the button in the upper right-hand corner.
2. Enable Workflows (*Settings → Advanced Settings → Enable Repository Actions*).
3. Check for available runners. The runner management tab can be found at the top of the 'Workflows' tab. If no runners
are available, you'll need to register one before proceeding.
4. Clone the project. For example:
```commandline
git clone https://github.com/yourorg/modrec-workflow.git
git clone https://git.riahub.ai/user/modrec-workflow.git
cd modrec-workflow
```
2. Configure the Workflow
All workflow parameters (data paths, model architecture, training settings) are set in `conf/app.yaml`.
Example:
```yaml
dataset:
  input_dir: data/recordings
  num_slices: 8
  train_split: 0.8
  val_split: 0.2
```
5. Set the workflow runner in `.riahub/workflows/workflow.yaml`. The runner is set on line 13:
```yaml
runs-on: ubuntu-latest
```
**Note:** We recommend running this demo on a GPU-enabled runner. If a GPU runner is not available, you can still run
the workflow, but we suggest reducing the number of training epochs to keep runtime reasonable.
6. (Optional) Configure the workflow. All parameters—including file paths, model architecture, and training
settings—are set in `conf/app.yaml`. Want to jump right in? The default configuration is suitable for getting started.
7. Push changes. This will start the workflow automatically.
8. Inspect the workflow output. You can expand and collapse individual steps to view their terminal output. A check
mark indicates that the step completed successfully.
9. Inspect the workflow artifacts. Additional information on workflow artifacts can be found in the next section.
## Workflow artifacts
The example generates several workflow artifacts, including:
- `dataset`: The training and validation datasets: `train.h5` and `val.h5`, respectively.
- `checkpoints`: Saved model checkpoints. Each checkpoint contains the model's learned weights at various
stages of training.
- `onnx-file`: The trained model as an [ONNX](https://onnx.ai/) graph.
- `ort-file`: Model in `.ORT` format, recommended for edge deployments. (`.ORT` files are optimized and serialized
by [ONNX Runtime](https://onnxruntime.ai/) for more efficient loading and execution.)
- `profile-data`: Model execution traces, in JSON format.
- `recordings`: Folder of synthesised signal recordings.
## 🤝 Contribution
We welcome contributions from the community! Whether it's an enhancement, bug fix, or new how-to guide, your
input is valuable. To get started, please [contact us](https://www.qoherent.ai/contact/) directly. We look forward to collaborating with
you. 🚀
To report an issue or a security vulnerability, please submit a [bug report](https://git.riahub.ai/qoherent/modrec-workflow/issues).
Qoherent is dedicated to fostering a friendly, safe, and inclusive environment for everyone. For more information on
our commitment to diversity, please refer to our [Diversity Statement](https://github.com/qoherent/.github/blob/main/docs/DIVERSITY_STATEMENT.md).
We kindly insist that all contributors review and adhere to our [Code of Conduct](https://github.com/qoherent/.github/blob/main/docs/CODE_OF_CONDUCT.md) and that all code contributors
review our [Coding Guidelines](https://github.com/qoherent/.github/blob/main/docs/CODING.md).
## 🖊️ Authorship
This demonstration was developed by [Liyu Xiao](https://www.linkedin.com/in/liyu-xiao-593176206/) during his summer co-op term at Qoherent.
If you like this project, don't forget to give it a star! ⭐
## 📄 License
This example is **free and open-source**, released under [AGPLv3](https://www.gnu.org/licenses/agpl-3.0.en.html).
Alternative licensing options are available. Please [contact us](https://www.qoherent.ai/contact/)
for further details.
### Configure GitHub Secrets
@ -34,29 +179,14 @@ Before running the pipeline, add the following repository secrets in GitHub (Set
Once secrets are configured, you can run the pipeline:
3. Run the Pipeline
Once you have updated `app.yaml`, any push or pull request to your repo will start the workflow.
## Artifacts Created
After successful execution, the workflow produces several artifacts in the output:
- dataset
  - A folder containing two `.h5` datasets, `train` and `val`.
- Checkpoints
  - Contains saved model checkpoints. Each checkpoint includes the model's learned weights at various stages of training.
- ONNX File
  - Contains the trained model in a standardized format that allows it to be run efficiently across different platforms and deployment environments.
- JSON Trace File (`*.json`)
  - Captures a full trace of model training and inference performance for profiling and debugging.
  - Useful for identifying performance bottlenecks, optimizing resource usage, and tracking metadata.
- ORT File (`*.ort`)
  - An ONNX Runtime (ORT) model file, optimized for fast inference on various platforms.
  - Why is it useful?
    - You can deploy this file on edge devices or servers, or integrate it into production systems for real-time signal classification.
    - ORT files are cross-platform and allow easy inference acceleration using ONNX Runtime.
3.
## How to View the JSON Trace File
- Captures a full trace of model training and inference performance for profiling and debugging
- Useful for identifying performance bottlenecks, optimizing resource usage, and tracking metadata
-
1. Open the [Perfetto UI](https://ui.perfetto.dev/).
2. Click on Open Trace File and select your JSON trace file.
3. Explore the detailed visualizations of performance metrics, timelines, and resource usage to diagnose bottlenecks and optimize your workflow.
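
For a quick look without the browser, a trace can also be summarized from a script. The sketch below assumes the trace follows the Trace Event Format that Perfetto can open, where complete events are marked with `"ph": "X"` and carry a `"dur"` field in microseconds. That layout has not been verified against this workflow's actual output, and the helper names `load_events` and `summarize_trace`, the sample events, and the commented-out file path are hypothetical.

```python
import json
from collections import defaultdict


def load_events(path):
    """Load trace events from a JSON file (hypothetical helper)."""
    with open(path) as f:
        data = json.load(f)
    # Assumption: the file is either a bare list of events or an object
    # wrapping them under "traceEvents", the two layouts the format allows.
    return data["traceEvents"] if isinstance(data, dict) else data


def summarize_trace(events):
    """Return (name, total duration in microseconds) pairs, longest first."""
    totals = defaultdict(int)
    for event in events:
        # Only complete events ("ph" == "X") carry a duration.
        if event.get("ph") == "X" and "dur" in event:
            totals[event.get("name", "<unnamed>")] += event["dur"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Small in-memory sample standing in for a real trace file.
sample = [
    {"name": "conv1", "ph": "X", "ts": 0, "dur": 120},
    {"name": "relu1", "ph": "X", "ts": 120, "dur": 15},
    {"name": "conv1", "ph": "X", "ts": 135, "dur": 110},
]
for name, total_us in summarize_trace(sample):
    print(f"{name}: {total_us} us")

# With a real file (path is a placeholder):
# events = load_events("path/to/trace.json")
```

Sorting by total duration puts the most expensive events first, which is usually the starting point when looking for bottlenecks.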