File Processing
- class dynamicdl.processing.jsonfile.JSONFile(form: dict[str | DataType | Static | Generic | Alias, Any] | list[Any])
Bases: DataFile
The JSONFile class represents a JSON annotation file. It has the most direct mapping from form to parsed data: the form follows the same nested dict/list structure as the JSON data, written as Python dicts and lists.
Example:
{
    "images": [
        {
            "id": 0,
            "file_name": "sample.jpg"
        }
    ],
    "categories": [
        {
            "id": 0,
            "name": "my_class"
        }
    ],
    "annotations": [
        {
            "image_id": 0,
            "category_id": 0,
            "bbox": [1.0, 2.0, 3.0, 4.0]
        }
    ]
}
JSONFile({
    'images': [{
        'id': DT.IMAGE_ID,
        'file_name': Generic('{}.jpg', DT.IMAGE_NAME)
    }],
    'categories': Pairing([{
        'id': DT.BBOX_CLASS_ID,
        'name': DT.BBOX_CLASS_NAME
    }], DT.BBOX_CLASS_ID, DT.BBOX_CLASS_NAME),
    'annotations': [{
        'image_id': DT.IMAGE_ID,
        'category_id': DT.BBOX_CLASS_ID,
        'bbox': [DT.XMIN, DT.YMIN, DT.WIDTH, DT.HEIGHT]
    }]
})
Notice how the JSONFile constructor mirrors the structure of the JSON data exactly, with a placeholder (such as DT.IMAGE_ID or a Generic) marking each position that holds a data item.
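To make the mapping concrete, here is a minimal sketch of the equivalent manual extraction using only the standard library json module. It does not use or reproduce DynamicDL's parser; the record keys (image_name, class_name, and so on) are illustrative names chosen to match the placeholders in the form above.

```python
import json

# The same sample annotation data as above, inlined so the sketch is self-contained.
raw = """
{
  "images": [{"id": 0, "file_name": "sample.jpg"}],
  "categories": [{"id": 0, "name": "my_class"}],
  "annotations": [
    {"image_id": 0, "category_id": 0, "bbox": [1.0, 2.0, 3.0, 4.0]}
  ]
}
"""
data = json.loads(raw)

# Lookup tables keyed by ID. The image name drops the file extension,
# mirroring Generic('{}.jpg', DT.IMAGE_NAME) in the form.
image_names = {img["id"]: img["file_name"].rsplit(".", 1)[0] for img in data["images"]}
class_names = {cat["id"]: cat["name"] for cat in data["categories"]}

# One flat record per annotation, joined to its image and class by ID.
records = []
for ann in data["annotations"]:
    xmin, ymin, width, height = ann["bbox"]
    records.append({
        "image_name": image_names[ann["image_id"]],
        "class_name": class_names[ann["category_id"]],
        "xmin": xmin,
        "ymin": ymin,
        "width": width,
        "height": height,
    })
```

The form passed to JSONFile declares these same associations in place, so no hand-written join of this kind is needed.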
- class dynamicdl.processing.csvfile.CSVFile(form: Iterable[DataType | Static | Generic | Alias], header: bool = True)
Bases: DataFile
Utility functions for parsing csv files.
- Parameters:
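As a rough illustration of what a column-wise form could mean, the sketch below uses the standard library csv module rather than DynamicDL. It assumes, based only on the signature above, that each entry of form describes one column in order and that header=True means the first row is skipped. The helper parse_csv, the sample data, and the column names are all hypothetical.

```python
import csv
import io

def parse_csv(text, columns, header=True):
    """Map each row's cells onto the given column names, in order."""
    reader = csv.reader(io.StringIO(text))
    if header:
        next(reader, None)  # discard the header row if there is one
    # Skip blank rows; every cell value stays a string in this sketch.
    return [dict(zip(columns, row)) for row in reader if row]

# Hypothetical CSV annotation data with a header row.
sample = "file,label,xmin,ymin\nimg1.jpg,cat,10,20\nimg2.jpg,dog,30,40\n"
rows = parse_csv(sample, ["image_name", "class_name", "xmin", "ymin"])
```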
- class dynamicdl.processing.txtfile.TXTFile(form: dict[str | DataType | Static | Generic | Alias, Any] | list[Any], ignore_type: list[Generic | str] | Generic | str | None = None)
Bases: DataFile
The TXTFile class is an annotation object for parsing .txt files. It can also parse anything represented in plaintext, i.e. with UTF-8 encoding. Its form resembles a nested dict structure, but take care: distinct kinds of lines must take distinct forms so that each line can be matched without ambiguity.
An example of a txt file that we want to parse:
imageset1
class1
image1
1.0 2.0 3.0 4.0
5.0 6.0 7.0 8.0
image2
2.0 3.0 5.6 2.43
image3
5.4 12.4 543.2 12.3
2.0 3.0 5.6 2.44
2.0 3.0 5.6 2.46
2.0 3.0 5.6 2.48
class2
image4
32.54 21.4 32.43 12.23
image5
imageset2
class1
image6
32.54 21.4 32.43 12.256
classes
class1 abc
class2 def
class3 ghi
Observe that each line can be distinctly classified in a hierarchical sense. That is, each individual line can be attributed to a single purpose.
TXTFile({
    Generic('imageset{}', DT.IMAGE_SET_ID): {
        Generic('class{}', DT.CLASS_ID): {
            Generic('image{}', DT.IMAGE_ID): [
                Generic('{} {} {} {}', DT.X1, DT.X2, DT.Y1, DT.Y2)
            ]
        }
    },
    'classes': Pairing([
        Generic('class{} {}', DT.CLASS_ID, DT.CLASS_NAME)
    ], DT.CLASS_ID, DT.CLASS_NAME)
})
Notice how the form inherits the natural structure of the file. Each Generic is distinct from the others, so the dataset is not ambiguous. Written hierarchically, the file would look as follows:
imageset1
    class1
        image1
            1.0 2.0 3.0 4.0
            5.0 6.0 7.0 8.0
        image2
            2.0 3.0 5.6 2.43
        image3
            5.4 12.4 543.2 12.3
            2.0 3.0 5.6 2.44
            2.0 3.0 5.6 2.46
            2.0 3.0 5.6 2.48
    class2
        image4
            32.54 21.4 32.43 12.23
        image5
imageset2
    class1
        image6
            32.54 21.4 32.43 12.256
classes
    class1 abc
    class2 def
    class3 ghi
Notice that this is exactly the structure reflected in the above code used to parse the file. We can also specify an ignore_type such that any line which matches the Generic or string passed in is skipped.
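The requirement that distinct lines take distinct forms can be illustrated with plain regular expressions. The sketch below is not DynamicDL's matching algorithm; it assumes that each Generic behaves roughly like a full-line pattern in which every {} captures one token. The names LINE_FORMS and classify are hypothetical.

```python
import re

# One pattern per kind of line in the example file. Because each pattern must
# match the whole line, no line can satisfy more than one of them.
LINE_FORMS = [
    ("imageset", re.compile(r"imageset(\d+)")),
    ("class", re.compile(r"class(\d+)")),
    ("image", re.compile(r"image(\d+)")),
    ("box", re.compile(r"(\S+) (\S+) (\S+) (\S+)")),
    ("classes", re.compile(r"classes")),
    ("class_pair", re.compile(r"class(\d+) (\S+)")),
]

def classify(line):
    """Return (form_name, captured_values) for the single form the line matches."""
    text = line.strip()
    hits = []
    for name, pattern in LINE_FORMS:
        match = pattern.fullmatch(text)
        if match:
            hits.append((name, match.groups()))
    if len(hits) != 1:
        raise ValueError(f"line {text!r} matched {len(hits)} forms, expected exactly 1")
    return hits[0]

sample = ["imageset1", "class1", "image1", "1.0 2.0 3.0 4.0", "classes", "class1 abc"]
labels = [classify(line)[0] for line in sample]
```

If two forms could match the same line, classify would raise, which is the kind of ambiguity the form is meant to rule out.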
- Parameters:
- class dynamicdl.processing.xmlfile.XMLFile(form: dict[Static | Generic, Any])
Bases: DataFile
The XMLFile class represents an annotation object and is similar to the JSONFile class in terms of hierarchical structure and parsing. The one key difference is that AmbiguousList is needed in place of GenericList: several tags with the same name are parsed as a list, while a tag that appears only once is parsed as a single item. For this reason, XMLFile automatically interprets list objects in the form as AmbiguousList; if a GenericList is desired, it must be instantiated manually.
The form follows the hierarchy of the file, just as in JSONFile. Here is a snippet from the Oxford-IIIT Pets Dataset:
<annotation>
    <folder>OXIIIT</folder>
    <filename>Abyssinian_1.jpg</filename>
    <source>
        <database>OXFORD-IIIT Pet Dataset</database>
        <annotation>OXIIIT</annotation>
        <image>flickr</image>
    </source>
    <size>
        <width>600</width>
        <height>400</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>cat</name>
        <pose>Frontal</pose>
        <truncated>0</truncated>
        <occluded>0</occluded>
        <bndbox>
            <xmin>333</xmin>
            <ymin>72</ymin>
            <xmax>425</xmax>
            <ymax>158</ymax>
        </bndbox>
        <difficult>0</difficult>
    </object>
</annotation>
Here we do not specify the extraneous information and get straight to the point:
XMLFile({
    "annotation": {
        "filename": Generic("{}.jpg", DT.IMAGE_NAME),
        "object": AmbiguousList({
            "name": DT.BBOX_CLASS_NAME,
            "bndbox": {
                "xmin": DT.XMIN,
                "ymin": DT.YMIN,
                "xmax": DT.XMAX,
                "ymax": DT.YMAX
            }
        })
    }
})
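The one-item-versus-list ambiguity is easy to reproduce with the standard library. The sketch below uses xml.etree.ElementTree, not DynamicDL, and shows one way to treat a single object tag and several object tags uniformly, which is the role the text above attributes to AmbiguousList. The helper name boxes and the second (dog) object are hypothetical.

```python
import xml.etree.ElementTree as ET

def boxes(xml_text):
    """Return a list of (class_name, (xmin, ymin, xmax, ymax)) for every <object>."""
    root = ET.fromstring(xml_text)
    # findall always returns a list, whether the tag occurs once or many times,
    # so the single-object and multi-object cases need no separate handling.
    return [
        (
            obj.findtext("name"),
            tuple(int(obj.findtext(f"bndbox/{key}")) for key in ("xmin", "ymin", "xmax", "ymax")),
        )
        for obj in root.findall("object")
    ]

CAT = (
    "<object><name>cat</name><bndbox><xmin>333</xmin><ymin>72</ymin>"
    "<xmax>425</xmax><ymax>158</ymax></bndbox></object>"
)
# Hypothetical second object, added only to show the multi-tag case.
DOG = (
    "<object><name>dog</name><bndbox><xmin>10</xmin><ymin>20</ymin>"
    "<xmax>30</xmax><ymax>40</ymax></bndbox></object>"
)

one_object = boxes(f"<annotation>{CAT}</annotation>")
two_objects = boxes(f"<annotation>{CAT}{DOG}</annotation>")
```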
- class dynamicdl.processing.yamlfile.YAMLFile(form: dict[Static | Generic, Any])
Bases: DataFile
The YAMLFile class represents an annotation object and is similar to the JSONFile class in terms of hierarchical structure and parsing.
The form follows the hierarchy of the file, just as in JSONFile. Here is a snippet from the Tomato Leaf Diseases Dataset:
train: ../train/images
val: ../valid/images
test: ../test/images
nc: 7
names: ['Bacterial Spot', 'Early_Blight', 'Healthy', 'Late_blight', 'Leaf Mold', 'Target_Spot', 'black spot']
roboflow:
    workspace: sylhet-agricultural-university
    project: tomato-leaf-diseases-detect
    version: 3
    license: Public Domain
Of particular interest is the names list, in which we need an ImpliedList to set up a pairing between class ID and class name. We do exactly that:
YAMLFile({
    'names': Pairing(
        ImpliedList([DT.BBOX_CLASS_NAME], indexer=DT.BBOX_CLASS_ID),
        DT.BBOX_CLASS_NAME,
        DT.BBOX_CLASS_ID
    )
})
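The idea behind pairing a class ID with each name can be sketched in plain Python: the ID is never written in the file, so it is implied by each name's position in the list. The snippet below is only an illustration of that idea, not DynamicDL's implementation, and it assumes that indexing starts at 0.

```python
# The names list from the YAML snippet above.
names = [
    'Bacterial Spot', 'Early_Blight', 'Healthy', 'Late_blight',
    'Leaf Mold', 'Target_Spot', 'black spot',
]

# Position in the list serves as the class ID (assumed to start at 0).
id_to_name = dict(enumerate(names))
# The reverse lookup, giving the two-way association that a pairing provides.
name_to_id = {name: class_id for class_id, name in id_to_name.items()}
```

Here the seven entries line up with nc: 7 in the file, so every class receives exactly one ID.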
Image dummy classes.
- class dynamicdl.processing.images.ImageEntry
Bases: object
Arbitrary image file to be used as a value in the key-value pairing of DynamicDL filestructure formats. It is a dummy object which provides absolute file and image data during processing, and is a marker object to recognize the presence of an image.
- class dynamicdl.processing.images.SegmentationImage
Bases: object
Arbitrary segmentation image file to be used as a value in the key-value pairing of DynamicDL filestructure formats. It is a dummy object which provides absolute file and segmentation image map data during processing, and is a marker object to recognize the presence of a segmentation image.
Module contents
The dynamicdl.processing module handles file processing, including annotation files and image files. These are to be used in describing DynamicDL dataset formats as values following a File key indicator.
Classes:
CSVFile
JSONFile
TXTFile
XMLFile
YAMLFile
ImageEntry
SegmentationImage