landingai.data_management
LandingLens
LandingLens client
Example
Create a client by specifying API Key and project id
client = LandingLens(project_id, api_key)
Parameters
project_id: int
LandingLens project id. Can override this default in individual commands.
api_key: Optional[str]
LandingLens API Key. If it's not provided, it will be read from the environment variable LANDINGAI_API_KEY, or from the .env file in your project root directory.
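The fallback described for api_key can be pictured with a small sketch. The helper name resolve_api_key is hypothetical and is not part of landingai; it only mirrors the documented order (explicit argument first, then the LANDINGAI_API_KEY environment variable) and leaves out the .env file lookup.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    api_key: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Hypothetical helper: return the explicit key, else the environment value."""
    env = os.environ if environ is None else environ
    if api_key:
        return api_key
    value = env.get("LANDINGAI_API_KEY")
    if not value:
        raise ValueError("No API key given and LANDINGAI_API_KEY is not set")
    return value


# An explicit key wins over the environment.
explicit = resolve_api_key("key-from-argument", {"LANDINGAI_API_KEY": "key-from-env"})
# Without an explicit key, the environment variable is used.
from_env = resolve_api_key(None, {"LANDINGAI_API_KEY": "key-from-env"})
```

Passing the environment as a mapping keeps the sketch testable without touching the real process environment.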
Source code in landingai/data_management/client.py
LegacyTrainingDataset
A client for fetching the training dataset from legacy training flows.
Source code in landingai/data_management/dataset.py
get_legacy_training_dataset(output_dir, job_id)
Get the training dataset from legacy training flow by job_id. Currently, it only supports segmentation and classification datasets.
Example output of the returned dataframe for a segmentation dataset:
media_id seg_mask_prediction_path seg_mask_label_path
0 10413664 /work/landingai-python/104136... /work/landingai-python/104136...
1 10413665 /work/landingai-python/104136... /work/landingai-python/104136...
2 10413666 /work/landingai-python/104136... /work/landingai-python/104136...
NOTE:
1. This dataset has a format similar to the dataset returned by TrainingDataset.get_training_dataset().
2. The only difference is that the prediction mask is thresholded, i.e. the value of each pixel is either 0 or 1.
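To make the difference in note 2 concrete, here is a minimal numpy sketch of turning a confidence mask into a 0/1 mask. The 0.5 cutoff and the array values are assumptions chosen for illustration; the threshold actually applied by the legacy flow is not stated here.

```python
import numpy as np

# Made-up per-pixel confidence scores for one class (height=2, width=3).
confidence_mask = np.array([[0.10, 0.70, 0.95],
                            [0.49, 0.50, 0.02]])

THRESHOLD = 0.5  # assumed value, chosen only for this example

# Pixels at or above the threshold become 1, all others become 0.
binary_mask = (confidence_mask >= THRESHOLD).astype(np.uint8)
```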
Example output of the returned dataframe for a classification dataset:
media_id label_class prediction_score prediction_class prediction_type
0 9789913 black_spot 0.992697 black_spot correct
1 9789914 black_spot 0.996753 black_spot correct
... ... ... ... ... ...
1801 9791719 unclassified 0.969400 unclassified correct
1802 9791720 unclassified 0.778278 unclassified correct
Source code in landingai/data_management/dataset.py
TrainingDataset
A client for fetching the (Fast & Easy) training dataset.
Source code in landingai/data_management/dataset.py
get_training_dataset(output_dir, include_image_metadata=False)
Get the most recently used training dataset.
Example output of the returned dataframe:
id split classes seg_mask_prediction_path media_level_predicted_score label_id seg_mask_label_path media_level_label metadata
0 11229595 None [] images/11229595_pred.npy NaN 11301603.0 images/11229595_gt.npy OK {}
1 11229597 None [] images/11229597_pred.npy NaN NaN None None {}
2 9918918 train [screw] images/9918918_pred.npy 0.954456 8792257.0 images/9918918_gt.npy NG {}
3 9918924 dev [screw] images/9918924_pred.npy 0.843393 8792265.0 images/9918924_gt.npy NG {'creator': 'bob'}
4 9918921 train [screw] images/9918921_pred.npy 0.956114 8792260.0 images/9918921_gt.npy NG {}
5 9918923 train [screw] images/9918923_pred.npy 0.943873 8792262.0 images/9918923_gt.npy NG {'creator': 'foo'}
NOTE:
1. Ground truth and prediction masks will be saved to the output_dir as serialized numpy binary files. The file name is the media_id with a suffix of "_gt.npy" or "_pred.npy". You can load the numpy array by calling np.load(file_path). The shape of the numpy array is (height, width, num_classes). The 0th channel is the first class, the 1st channel is the second class, and so on. (The background class is not included.)
2. For prediction masks, the value of each pixel is the confidence score of the class, i.e. it's not thresholded. For ground truth masks, the value of each pixel is either 0 or 1.
3. The serialized mask will be an empty numpy array when there is no prediction or ground truth mask, e.g. when the ground truth label is OK (no defect). Be sure to check the shape of the ground truth mask before using it.
4. The training dataset could also include images that are not used for training. Those images will have a None value for the following fields: label_id, seg_mask_label_path, media_level_label. Tip: for evaluating the model performance, you can filter out those images by checking the label_id field.
5. The split field could be None, train, dev, or test. None means "unassigned" split.
6. The metadata field is a dictionary that contains the metadata associated with each image. It's empty by default and is only populated when include_image_metadata is True.
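The notes above can be exercised with a short sketch. It does not call LandingLens: it writes stand-in mask files to a temporary directory and builds a small dataframe using column names from the example output. The ids, label ids and mask contents are made up for illustration.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

with tempfile.TemporaryDirectory() as output_dir:
    out = Path(output_dir)
    # Stand-ins for files written to output_dir: one labeled image with a
    # (height, width, num_classes) mask, and one "OK" image whose ground
    # truth mask is an empty array.
    np.save(out / "1_gt.npy", np.ones((4, 5, 2), dtype=np.uint8))
    np.save(out / "2_gt.npy", np.array([]))

    df = pd.DataFrame(
        {
            "id": [1, 2, 3],
            "label_id": [101.0, 102.0, None],
            "seg_mask_label_path": [
                str(out / "1_gt.npy"),
                str(out / "2_gt.npy"),
                None,
            ],
        }
    )

    # Keep only images that carry a label (label_id is not None/NaN).
    labeled = df[df["label_id"].notna()]

    mask_shapes = {}
    for row in labeled.itertuples(index=False):
        mask = np.load(row.seg_mask_label_path)
        # An empty array means "no mask"; check before indexing channels.
        mask_shapes[int(row.id)] = mask.shape if mask.size > 0 else None
```

Checking mask.size before indexing avoids an IndexError on images labeled OK, whose mask has no channels to select.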
Source code in landingai/data_management/dataset.py
Label
Label management API client. This class provides a set of APIs to manage the label of a particular project on LandingLens. For example, you can use this class to list all the available labels for a given project.
Example
client = Label(project_id, api_key)
client.get_label_map()
# {'0': 'ok', '1': 'cat', '2': 'dog'}
Parameters
project_id: int
LandingLens project id. Can override this default in individual commands.
api_key: Optional[str]
LandingLens API Key. If it's not provided, it will be read from the environment variable LANDINGAI_API_KEY, or from the .env file in your project root directory.
Source code in landingai/data_management/label.py
get_label_map()
Get all the available labels for a given project.
Returns
Dict[str, str] A dictionary of label index to label name.
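Because the returned keys are label indices stored as strings, a reverse lookup from label name to integer index is often handy. This is a minimal sketch using the example map from this section; a real project's map will differ.

```python
from typing import Dict

# Example map copied from the Label example; a real project will differ.
label_map: Dict[str, str] = {"0": "ok", "1": "cat", "2": "dog"}

# Invert it to look up the index for a label name, casting the index to int.
name_to_index: Dict[str, int] = {
    name: int(index) for index, name in label_map.items()
}
```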
Source code in landingai/data_management/label.py
Media
Media management API client. This class provides a set of APIs to manage the medias (images) uploaded to LandingLens. For example, you can use this class to upload medias (images) to LandingLens or list the medias that are already uploaded to LandingLens.
Example
client = Media(project_id, api_key)
client.upload("path/to/image.jpg")
client.upload("path/to/image_folder")
print(client.ls())
Parameters
project_id: int
LandingLens project id. Can override this default in individual commands.
api_key: Optional[str]
LandingLens API Key. If it's not provided, it will be read from the environment variable LANDINGAI_API_KEY, or from the .env file in your project root directory.
Source code in landingai/data_management/media.py
ls(offset=0, limit=1000, media_status=None, **metadata)
List medias with metadata for the given project id. Results can be filtered using metadata.
NOTE: pagination is applied with the offset and limit parameters.
Parameters
offset: int
Defaults to 0. As in standard pagination.
limit: int
Max 1000. Defaults to 1000. As in standard pagination.
media_status: Union[str, List]
Gets only medias with the specified statuses. Defaults to None, in which case medias with all statuses are fetched. Possible values: raw, pending_labeling, pending_review, rejected, approved
**metadata
Kwargs used as metadata for server-side filtering of the results.
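Since limit is capped at 1000, listing a larger project means walking the offset forward page by page. The sketch below shows that loop in isolation. The structure returned by ls() is not described in this section, so the loop takes a hypothetical fetch_page(offset, limit) callable that returns a plain list; iter_pages and fake_fetch_page are illustrative names, not part of landingai.

```python
from typing import Callable, Iterator, List

PAGE_LIMIT = 1000  # documented maximum for `limit`


def iter_pages(
    fetch_page: Callable[[int, int], List[dict]],
    limit: int = PAGE_LIMIT,
) -> Iterator[dict]:
    """Yield items page by page until a page comes back shorter than `limit`."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page
        if len(page) < limit:
            break
        offset += limit


# Hypothetical stand-in for a server holding 2500 medias.
_ALL_MEDIAS = [{"id": i} for i in range(2500)]


def fake_fetch_page(offset: int, limit: int) -> List[dict]:
    return _ALL_MEDIAS[offset : offset + limit]


medias = list(iter_pages(fake_fetch_page))
```

With a real client, fetch_page would wrap a call to ls(offset=..., limit=...) and extract the list of medias from whatever structure that call returns.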
Source code in landingai/data_management/media.py
update_split_key(media_ids, split_key)
Update the split key for a list of medias on the LandingLens platform.
Parameters
media_ids: List[int]
A list of media ids whose split key should be updated.
split_key: str
The split key to set for these medias. It can be 'train', 'dev', 'test' or '' (where '' represents Unassigned and is the default).
Example
client = Media(project_id, api_key)
# Assign split key 'test' to media ids 1001 and 1002
client.update_split_key(media_ids=[1001, 1002], split_key="test")
# Remove the split key for media ids 1001 and 1002
client.update_split_key(media_ids=[1001, 1002], split_key="")
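When many medias need different splits, it can help to group the ids by split key first, so that each group maps to a single update_split_key call. This is a sketch with a hypothetical helper, group_by_split, and made-up media ids; it only uses the split key values listed above.

```python
from collections import defaultdict
from typing import Dict, List

VALID_SPLIT_KEYS = {"train", "dev", "test", ""}  # values listed above


def group_by_split(assignments: Dict[int, str]) -> Dict[str, List[int]]:
    """Group media ids by split key, rejecting unknown keys."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for media_id, split_key in assignments.items():
        if split_key not in VALID_SPLIT_KEYS:
            raise ValueError(f"Unknown split key: {split_key!r}")
        groups[split_key].append(media_id)
    return dict(groups)


groups = group_by_split({1001: "test", 1002: "test", 1003: "train", 1004: ""})
# Each entry then maps to one call such as:
#   client.update_split_key(media_ids=ids, split_key=split_key)
```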
Source code in landingai/data_management/media.py
upload(source, split='', classification_name=None, object_detection_xml=None, seg_mask=None, seg_defect_map=None, nothing_to_label=False, metadata_dict=None, validate_extensions=True, tolerate_duplicate_upload=True, tags=None)
Upload media to platform.
Parameters
source: Union[str, Path, Image]
The image source to upload. It can be a path to the local image file, an
image folder or a PIL Image object. For image files, the supported formats
are jpg, jpeg, png, bmp and tiff.
split: str
Set this media to one split ('train'/'dev'/'test'), '' represents Unassigned
and is the default
classification_name: str
Set the media's classification if the project type is Classification or
Anomaly Detection
object_detection_xml: str
Path to the Pascal VOC xml file for object detection project
seg_mask: str
Path to the segmentation mask file for segmentation project
seg_defect_map: str
Path to the segmentation defect_map.json file for segmentation project.
To get this map, you can use the landingai.data_management.label.Label API.
See the code below for an example.
>>> client = Label(project_id, api_key)
>>> client.get_label_map()
{'0': 'ok', '1': 'cat', '2': 'dog'}
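Since seg_defect_map expects a path to a JSON file, the map has to be written to disk before uploading. The sketch below assumes the file simply holds the same index-to-name dictionary that get_label_map() returns; that file layout is an assumption and is not confirmed in this section.

```python
import json
import tempfile
from pathlib import Path

# Example map copied from above; in practice it would come from get_label_map().
label_map = {"0": "ok", "1": "cat", "2": "dog"}

with tempfile.TemporaryDirectory() as tmp_dir:
    defect_map_path = Path(tmp_dir) / "defect_map.json"
    defect_map_path.write_text(json.dumps(label_map, indent=2))
    # Read it back to confirm the file round-trips.
    reloaded = json.loads(defect_map_path.read_text())
```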
Raises landingai.exceptions.HttpError if it's a duplicate upload that is not tolerated (see tolerate_duplicate_upload).
Returns
Dict[str, Any] The result from the upload().
# Example output
{
"num_uploaded": 10,
"skipped_count": 0,
"error_count": 0,
"medias": [...],
"files_with_errors": {},
}
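A caller will usually want to check the counters in this result before moving on. The sketch below relies only on the keys shown in the example output; upload_succeeded is a hypothetical helper, not a landingai function.

```python
from typing import Any, Dict


def upload_succeeded(result: Dict[str, Any]) -> bool:
    """Return True when nothing failed, based on the keys in the example output."""
    return result["error_count"] == 0 and not result["files_with_errors"]


example_result = {
    "num_uploaded": 10,
    "skipped_count": 0,
    "error_count": 0,
    "medias": [],
    "files_with_errors": {},
}
ok = upload_succeeded(example_result)
```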
Source code in landingai/data_management/media.py
Metadata
Metadata management API client. This class provides a set of APIs to manage the metadata of the medias (images) uploaded to LandingLens. For example, you can use this class to update the metadata of the uploaded medias.
Example
client = Metadata(project_id, api_key)
client.update([101, 102, 103], creator="tom")
Parameters
project_id: int
LandingLens project id. Can override this default in individual commands.
api_key: Optional[str]
LandingLens API Key. If it's not provided, it will be read from the environment variable LANDINGAI_API_KEY, or from the .env file in your project root directory.
Source code in landingai/data_management/metadata.py
get(media_id)
Return all the metadata associated with a given media.
Source code in landingai/data_management/metadata.py
update(media_ids, **input_metadata)
Update or insert a dictionary of metadata for a set of medias.
Parameters
media_ids
Media ids to update.
input_metadata
A dictionary of metadata to be updated or inserted. Each metadata key must already be created/registered on LandingLens before calling update().
Returns
Dict[str, Any] The result from the update().
Source code in landingai/data_management/metadata.py
Encoder
Bases: JSONEncoder
JSON encoder that converts all keys to camel case
Source code in landingai/data_management/utils.py
PrettyPrintable
A mix-in class that enables its subclasses to be serialized into a pretty-printed string
Source code in landingai/data_management/utils.py
__repr__()
obj_to_dict(obj)
Convert an object to a json dictionary with camel case keys
obj_to_params(obj)
Convert an object to query parameters in dict format where the dict keys are in camel case.
Source code in landingai/data_management/utils.py
to_camel_case(snake_str)
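The camel case helpers in this module are described only briefly, so here is a minimal sketch of the usual snake_case to camelCase conversion and of applying it recursively to dictionary keys. The function names carry a _sketch suffix or are hypothetical, and the bodies are an assumed, common approach rather than the library's actual code.

```python
def to_camel_case_sketch(snake_str: str) -> str:
    """Convert snake_case to camelCase (an assumed, common implementation)."""
    first, *rest = snake_str.split("_")
    return first + "".join(part[:1].upper() + part[1:] for part in rest)


def keys_to_camel_case(value):
    """Recursively rename dict keys; lists are walked, other values are kept."""
    if isinstance(value, dict):
        return {
            to_camel_case_sketch(key): keys_to_camel_case(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [keys_to_camel_case(item) for item in value]
    return value


converted = keys_to_camel_case(
    {"media_status": "raw", "page_info": {"sort_by": "id"}}
)
```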
validate_metadata(input_metadata, metadata_mapping)
Validate the input metadata against the metadata mapping. Raise ValueError if any metadata keys are not available.
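The check described here amounts to a set difference between the keys supplied and the keys registered. The sketch below assumes metadata_mapping is a dictionary keyed by registered metadata name; the real structure of the mapping is not documented in this section, and validate_metadata_sketch is an illustrative name.

```python
from typing import Any, Dict


def validate_metadata_sketch(
    input_metadata: Dict[str, Any], metadata_mapping: Dict[str, Any]
) -> None:
    """Raise ValueError if any input key is missing from the mapping."""
    missing = sorted(set(input_metadata) - set(metadata_mapping))
    if missing:
        raise ValueError(f"Metadata keys not available: {missing}")


# Assumed mapping shape: registered key name -> some server-side identifier.
mapping = {"creator": 1, "timestamp": 2}

validate_metadata_sketch({"creator": "tom"}, mapping)  # passes silently

try:
    validate_metadata_sketch({"creator": "tom", "camera": "A"}, mapping)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```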