Docstring updates to fix Sphinx build warnings and errors

This commit is contained in:
Michael Luciuk 2025-09-10 12:41:10 -04:00
parent 0d492c59d2
commit 7cd8d3556b
3 changed files with 46 additions and 105 deletions


@ -145,100 +145,85 @@ class RadioDataset(ABC):
classes_to_augment: Optional[str | list[str]] = None,
inplace: Optional[bool] = False,
) -> RadioDataset | None:
"""
Supplement the dataset with new examples by applying various transformations
to the pre-existing examples in the dataset.
.. todo::
   This method is currently under construction and may produce unexpected results.
The process of supplementing a dataset to artificially increase the diversity
of examples is called augmentation. Training on augmented data can enhance
the generalization and robustness of deep machine learning models. For more
information, see `A Complete Guide to Data Augmentation
<https://www.datacamp.com/tutorial/complete-guide-data-augmentation>`_.
Metadata for each new example will be identical to the metadata of the
pre-existing example from which it was generated. The metadata will be
extended to include an 'augmentation' column, populated with the string
representation of the transform used.
Augmented data should only be used for model training, not for testing or
validation.
Unless specified, augmentations are applied equally across classes, maintaining
the original class distribution.
If target_size does not match the sum of the original class sizes scaled by
an integer multiple, the class distribution is slightly adjusted to satisfy
target_size.
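The adjustment above amounts to a largest-remainder rounding step: scale each class size by ``target_size / total``, round to the nearest integers, then nudge the classes with the largest fractional parts until the sizes sum to ``target_size``. The helper below is a hypothetical sketch of that scheme, not the library's actual implementation.

```python
def apportion(class_sizes: dict[str, int], target_size: int) -> dict[str, int]:
    """Scale class sizes to sum to target_size, keeping the distribution close.

    Hypothetical illustration of largest-remainder rounding; not library code.
    """
    total = sum(class_sizes.values())
    # Exact (fractional) share of target_size for each class.
    exact = {k: v * target_size / total for k, v in class_sizes.items()}
    rounded = {k: round(x) for k, x in exact.items()}
    diff = target_size - sum(rounded.values())
    # Adjust the classes with the largest fractional parts first when adding,
    # the smallest fractional parts first when removing.
    order = sorted(exact, key=lambda k: exact[k] - int(exact[k]), reverse=(diff > 0))
    for k in order[: abs(diff)]:
        rounded[k] += 1 if diff > 0 else -1
    return rounded


print(apportion({"a": 100, "b": 500, "c": 300}, 1200))
# {'a': 133, 'b': 667, 'c': 400}
```

Because rounding can leave the sum off by a few examples, the final loop redistributes the difference one class at a time, which is the slight skew of the original distribution mentioned above.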
:param class_key: Class name used to augment from and calculate class distribution.
:type class_key: str
:param augmentations: A function or list of functions that take an example
and return a transformed version. Defaults to ``default_augmentations()``.
:type augmentations: callable or list of callables, optional
:param level: The extent of augmentation from 0.0 (none) to 1.0 (full). If
``classes_to_augment`` is specified, can be either:
* A single float: All classes augmented evenly to this level.
* A list of floats: Each element corresponds to the augmentation level
target for the corresponding class.
:type level: float or list of floats, optional
:param target_size: Target size of the augmented dataset. Overrides ``level``
if specified. If ``classes_to_augment`` is specified, can be either:
* A single float: All classes are augmented proportional to their
relative frequency until the dataset reaches target_size.
* A list of floats: Each element corresponds to the target size for the
corresponding class.
:type target_size: int or list of ints, optional
:param classes_to_augment: List of metadata keys of classes to augment.
:type classes_to_augment: string or list of strings, optional
:param inplace: If True, the augmentation is performed in place and ``None`` is returned.
:type inplace: bool, optional
:raises ValueError: If level has any values not in the range (0,1].
:raises ValueError: If the dataset size already meets or exceeds target_size.
:raises ValueError: If a class in classes_to_augment does not exist in class_key.
:return: The augmented dataset or None if ``inplace=True``.
:rtype: RadioDataset or None
**Examples:**
>>> from ria.dataset_manager.builders import AWGN_Builder
>>> builder = AWGN_Builder()
>>> builder.download_and_prepare()
>>> ds = builder.as_dataset()
>>> ds.get_class_sizes(class_key='col')
{'a': 100, 'b': 500, 'c': 300}
>>> new_ds = ds.augment(class_key='col', classes_to_augment=['a', 'b'], target_size=1200)
>>> new_ds.get_class_sizes(class_key='col')
{'a': 150, 'b': 750, 'c': 300}
"""
if augmentations is None:


@ -28,7 +28,7 @@ class Recording:
Metadata is stored in a dictionary of key-value pairs,
to include information such as sample_rate and center_frequency.
Annotations are a list of :class:`~ria_toolkit_oss.datatypes.Annotation`,
defining bounding boxes in time and frequency with labels and metadata.
Here, signal data is represented as a NumPy array. This class is then extended in the RIA Backends to provide
@ -48,7 +48,7 @@ class Recording:
:param metadata: Additional information associated with the recording.
:type metadata: dict, optional
:param annotations: A collection of :class:`~ria_toolkit_oss.datatypes.Annotation` objects defining bounding boxes.
:type annotations: list of Annotations, optional
:param dtype: Explicitly specify the data-type of the complex samples. Must be a complex NumPy type, such as
@ -444,34 +444,6 @@ class Recording:
else:
raise ValueError(f"Key {key} is protected and cannot be modified or removed.")
def view(self, output_path: Optional[str] = "images/signal.png", **kwargs) -> None:
"""Create a plot of various signal visualizations as a PNG image.
:param output_path: The output image path. Defaults to "images/signal.png".
:type output_path: str, optional
:param kwargs: Keyword arguments passed on to ria_toolkit_oss.view.view_sig.
:type kwargs: dict, optional
**Examples:**
Create a recording and view it as a plot in a .png image:
>>> import numpy
>>> from ria_toolkit_oss.datatypes import Recording
>>> samples = numpy.ones(10000, dtype=numpy.complex64)
>>> metadata = {
...     "sample_rate": 1e6,
...     "center_frequency": 2.44e9,
... }
>>> recording = Recording(data=samples, metadata=metadata)
>>> recording.view()
"""
from ria_toolkit_oss.view import view_sig
view_sig(recording=self, output_path=output_path, **kwargs)
def to_sigmf(self, filename: Optional[str] = None, path: Optional[os.PathLike | str] = None) -> None:
"""Write recording to a set of SigMF files.
@ -487,22 +459,6 @@ class Recording:
:raises IOError: If there is an issue encountered during the file writing process.
:return: None
**Examples:**
Create a recording and write it to a set of SigMF files:
>>> import numpy
>>> from ria_toolkit_oss.datatypes import Recording
>>> samples = numpy.ones(10000, dtype=numpy.complex64)
>>> metadata = {
...     "sample_rate": 1e6,
...     "center_frequency": 2.44e9,
... }
>>> recording = Recording(data=samples, metadata=metadata)
>>> recording.to_sigmf()
"""
from ria_toolkit_oss.io.recording import to_sigmf