Machine Learning Platform for AI provides a variety of intelligent image processing models for image classification, image recognition, semantic segmentation, and instance segmentation.

Image classification model

  • Overview

    The image classification model uses the Residual Networks (ResNet) framework. For more information, see Deep Residual Learning for Image Recognition. The image classification model is trained by using the ImageNet dataset.

  • Input format
    The input data must be in the JSON format. It contains the image field. The value of the field is the image content that is encoded in the Base64 format.
    {
      "image": "the image content that is encoded in the Base64 format"
    }
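As a minimal sketch of how such a request body might be assembled, the following Python snippet Base64-encodes raw image bytes and wraps them in the `image` field. The function name `build_request_body` and the placeholder bytes are illustrative assumptions; how the body is sent to the service is not covered here.

```python
import base64
import json


def build_request_body(image_bytes: bytes) -> str:
    """Return a JSON request body that carries the image as Base64 text.

    The function name is hypothetical; only the "image" field comes from
    the input format described above.
    """
    # Base64 output is pure ASCII, so it can be stored as a JSON string.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded})


# Placeholder bytes stand in for the contents of a real image file,
# for example the result of open(path, "rb").read().
body = build_request_body(b"placeholder image bytes")
```

The same body layout applies to all four models, because each of them expects a single `image` field.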
  • Output format
    The output format is JSON. The output data contains the following fields.
    Field | Description | Shape | Type
    class | The class ID. | [] | INT32
    class_name | The class name. | [] | STRING
    class_probs | The matching probabilities of all classes for the object. | [num_classes] | Dict[STRING, FLOAT]
    request_id | The unique ID of the request. | [] | STRING
    success | Indicates whether the request is successful. | [] | BOOL
    error_code | The error code of the request. | [] | INT
    error_msg | The error message of the request. | [] | STRING
    The following is an example of the output data.
    {
      "class": 3,
      "class_name": "coho4",
      "class_probs": {"coho1": 4.028851974258174e-10,
                      "coho2": 0.48115724325180054,
                      "coho3": 5.116515922054532e-07,
                      "coho4": 0.5188422446937221},
      "request_id": "9ac294a4-f387-4c48-b640-d2c6d41fcbee",
      "success": true
    }
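The classification output can be consumed along the following lines. This is a sketch under assumptions: the helper name `top_classes` is hypothetical, and the handling of a failed request (raising an error built from `error_code` and `error_msg` when `success` is false) is an assumed convention, not documented service behavior.

```python
import json


def top_classes(response_text: str, k: int = 2):
    """Return the k most probable (class_name, probability) pairs."""
    response = json.loads(response_text)
    # Assumption: error_code and error_msg are present when success is false.
    if not response.get("success", False):
        raise RuntimeError(
            f"{response.get('error_code')}: {response.get('error_msg')}"
        )
    probs = response["class_probs"]
    # Sort the name-to-probability mapping from most to least likely.
    return sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]


# The sample response is copied from the output example above.
sample = """
{
  "class": 3,
  "class_name": "coho4",
  "class_probs": {"coho1": 4.028851974258174e-10,
                  "coho2": 0.48115724325180054,
                  "coho3": 5.116515922054532e-07,
                  "coho4": 0.5188422446937221},
  "request_id": "9ac294a4-f387-4c48-b640-d2c6d41fcbee",
  "success": true
}
"""
ranked = top_classes(sample)
```

Ranking `class_probs` instead of reading only `class_name` makes close calls visible, such as the two nearly tied classes in the sample.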
  • Test data

    Download the test data to train the image classification model

Image recognition model

  • Overview

    The image recognition model uses the Faster R-CNN framework. For more information, see Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. The image recognition model is trained by using the COCO dataset.

  • Input format
    The input data must be in the JSON format. It contains the image field. The value of the field is the image content that is encoded in the Base64 format.
    {
      "image": "the image content that is encoded in the Base64 format"
    }
  • Output format
    The output format is JSON. The output data contains the following fields.
    Field | Description | Shape | Type
    detection_boxes | The bounding box of each object. The coordinates [y1, x1, y2, x2] are in the order [top, left, bottom, right]. | [num_detections, 4] | FLOAT
    detection_scores | The detection probability of the object. | [num_detections] | FLOAT
    detection_classes | The ID of the class to which the object belongs. | [num_detections] | INT
    detection_class_names | The name of the class to which the object belongs. | [num_detections] | STRING
    request_id | The unique ID of the request. | [] | STRING
    success | Indicates whether the request is successful. | [] | BOOL
    error_code | The error code of the request. | [] | INT
    error_msg | The error message of the request. | [] | STRING
    The following is an example of the output data.
    {
      "detection_boxes": [[243.5308074951172, 197.69570922851562, 385.59625244140625, 247.7247772216797], [292.1929931640625, 114.28043365478516, 571.2748413085938, 165.09771728515625]],
      "detection_scores": [0.9942291975021362, 0.9940272569656372],
      "detection_classes": [1, 1],
      "detection_class_names": ["text", "text"],
      "request_id": "9ac294a4-f387-4c48-b640-d2c6d41fcbee",
      "success": true
    }
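Because the boxes are ordered [top, left, bottom, right] rather than the more common [left, top, ...], it is easy to swap axes when consuming them. The sketch below unpacks each box explicitly and keeps only detections above a score threshold. The helper name `boxes_above` and the default threshold of 0.5 are illustrative assumptions.

```python
def boxes_above(response: dict, min_score: float = 0.5):
    """Convert detections to left/top/width/height records above min_score."""
    results = []
    detections = zip(
        response["detection_boxes"],
        response["detection_scores"],
        response["detection_classes"],
    )
    for box, score, class_id in detections:
        if score < min_score:
            continue
        # Documented order: [y1, x1, y2, x2] = [top, left, bottom, right].
        top, left, bottom, right = box
        results.append({
            "class_id": class_id,
            "score": score,
            "left": left,
            "top": top,
            "width": right - left,
            "height": bottom - top,
        })
    return results


# Values taken from the output example above (parsed response as a dict).
response = {
    "detection_boxes": [
        [243.5308074951172, 197.69570922851562, 385.59625244140625, 247.7247772216797],
        [292.1929931640625, 114.28043365478516, 571.2748413085938, 165.09771728515625],
    ],
    "detection_scores": [0.9942291975021362, 0.9940272569656372],
    "detection_classes": [1, 1],
}
kept = boxes_above(response)
```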
  • Test data

    Download the test data to train the image recognition model

Semantic segmentation model

  • Overview

    The semantic segmentation model uses the DeepLab V3 framework. For more information, see Rethinking Atrous Convolution for Semantic Image Segmentation. The semantic segmentation model is trained by using the PASCAL VOC dataset.

  • Input format
    The input data must be in the JSON format. It contains the image field. The value of the field is the image content that is encoded in the Base64 format.
    {
      "image": "the image content that is encoded in the Base64 format"
    }
  • Output format
    The output format is JSON. The output data contains the following fields.
    Field | Description | Shape | Type
    probs | The segmentation probability of the pixel. | [output_height, output_width] | FLOAT
    preds | The ID of the class to which the pixel belongs. | [output_height, output_width] | INT
    request_id | The unique ID of the request. | [] | STRING
    success | Indicates whether the request is successful. | [] | BOOL
    error_code | The error code of the request. | [] | INT
    error_msg | The error message of the request. | [] | STRING
    The following is an example of the output data.
    {
      "probs": [[[0.8, 0.8], [0.6, 0.7]], [[0.8, 0.5], [0.4, 0.3]]],
      "preds": [[1, 1], [0, 0]],
      "request_id": "9ac294a4-f387-4c48-b640-d2c6d41fcbee",
      "success": true
    }
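One simple way to summarize a segmentation result is to count how many pixels were assigned to each class ID in `preds`. The following sketch does this with the standard library only; the helper name `class_pixel_counts` is an illustrative assumption.

```python
from collections import Counter


def class_pixel_counts(preds):
    """Count the pixels assigned to each class ID in a 2-D preds grid."""
    # preds is a list of rows; each row holds one class ID per pixel.
    return Counter(class_id for row in preds for class_id in row)


# The preds grid is taken from the output example above.
counts = class_pixel_counts([[1, 1], [0, 0]])
```

Dividing each count by the total number of pixels gives the share of the image covered by each class.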
  • Test data

    Download the test data to train the semantic segmentation model

Instance segmentation model

  • Overview

    The instance segmentation model uses the Mask R-CNN framework. For more information, see Mask R-CNN. The instance segmentation model is trained by using the COCO dataset.

  • Input format
    The input data must be in the JSON format. It contains the image field. The value of the field is the image content that is encoded in the Base64 format.
    {
      "image": "the image content that is encoded in the Base64 format"
    }
  • Output format
    The output format is JSON. The output data contains the following fields.
    Field | Description | Shape | Type
    detection_boxes | The bounding box of each instance. The coordinates [y1, x1, y2, x2] are in the order [top, left, bottom, right]. | [num_detections, 4] | FLOAT
    detection_scores | The detection probability of the instance. | [num_detections] | FLOAT
    detection_classes | The ID of the class to which the instance belongs. | [num_detections] | INT
    detection_class_names | The name of the class to which the instance belongs. | [num_detections] | STRING
    detection_masks | The mask of the instance. | [num_detections, image_height, image_width] | BOOL
    request_id | The unique ID of the request. | [] | STRING
    success | Indicates whether the request is successful. | [] | BOOL
    error_code | The error code of the request. | [] | INT
    error_msg | The error message of the request. | [] | STRING
    The following is an example of the output data.
    {
      "detection_boxes": [[243.5308074951172, 197.69570922851562, 385.59625244140625, 247.7247772216797], [292.1929931640625, 114.28043365478516, 571.2748413085938, 165.09771728515625]],
      "detection_scores": [0.9942291975021362, 0.9940272569656372],
      "detection_classes": [1, 1],
      "detection_class_names": ["text", "text"],
      "detection_masks": [[[1, 1], [0, 0]], [[0, 1], [1, 1]]],
      "request_id": "9ac294a4-f387-4c48-b640-d2c6d41fcbee",
      "success": true
    }
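Each entry in `detection_masks` is a per-instance grid whose truthy cells mark the pixels that belong to that instance. As a small sketch, the pixel area of every instance can be computed as follows. The helper name `mask_areas` is an illustrative assumption, and the code accepts both the 1/0 values shown in the example and true/false values.

```python
def mask_areas(detection_masks):
    """Return the number of mask pixels for each detected instance."""
    return [
        # Count truthy cells, so 1/0 and True/False masks both work.
        sum(1 for row in mask for value in row if value)
        for mask in detection_masks
    ]


# The masks are taken from the output example above.
areas = mask_areas([[[1, 1], [0, 0]], [[0, 1], [1, 1]]])
```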
  • Test data

    Download the test data to train the instance segmentation model