OSS provides the data indexing feature, which indexes the metadata of objects. You can then specify object metadata as index conditions to query objects. This helps you understand how your data is structured, run queries, collect statistics, and manage objects efficiently.

Scenarios

To meet data audit or data supervision requirements, you may need to query specific objects from an Object Storage Service (OSS) bucket in which hundreds of millions of objects are stored. Each object carries several metadata attributes, including its name, ETag value, storage class, size, tags, and last modified time. The data indexing feature lets you combine simple query conditions and data aggregation methods based on your business requirements, which improves query efficiency.
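
The idea of combining simple conditions with aggregation can be illustrated with a small in-memory sketch. This is not the OSS API: the `ObjectMeta` class, the `query` and `aggregate` helpers, and the sample records are all hypothetical and exist only to show how conditions on metadata fields and statistics over the matching objects fit together.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ObjectMeta:
    """Hypothetical record holding the metadata fields listed above."""
    name: str
    etag: str
    storage_class: str
    size: int  # size in bytes
    last_modified: datetime
    tags: dict = field(default_factory=dict)


def query(objects, *conditions):
    """Return the objects that satisfy every condition (logical AND)."""
    return [o for o in objects if all(cond(o) for cond in conditions)]


def aggregate(objects):
    """Collect simple statistics over a result set."""
    return {
        "count": len(objects),
        "total_size": sum(o.size for o in objects),
        "distinct_storage_classes": len({o.storage_class for o in objects}),
    }


# Made-up sample data, for illustration only.
now = datetime(2023, 1, 1, tzinfo=timezone.utc)
sample = [
    ObjectMeta("logs/a.txt", "etag-1", "Standard", 2048, now, {"team": "audit"}),
    ObjectMeta("logs/b.txt", "etag-2", "IA", 512, now),
    ObjectMeta("img/c.jpg", "etag-3", "Standard", 4096, now),
    ObjectMeta("img/d.jpg", "etag-4", "Archive", 8192, now),
]

# Combine two simple conditions: larger than 1,024 bytes and not in Archive.
result = query(
    sample,
    lambda o: o.size > 1024,
    lambda o: o.storage_class != "Archive",
)
stats = aggregate(result)

print([o.name for o in result])
print(stats)
```

In the real feature, the index is maintained by OSS and the conditions are evaluated on the server side; the sketch only mirrors the shape of the operation.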

Usage notes

  • Supported regions

    The data indexing feature is supported only in the China (Hangzhou) and Australia (Sydney) regions.

  • Billing

    During the public preview, you are not charged for metadata management. For more information about billable items of the data indexing feature, see Data indexing fees.

  • Number of objects

    You can enable metadata management only for buckets that contain no more than 30 million objects. To enable metadata management for a bucket that contains more than 30 million objects, contact technical support.

  • Multipart upload

    If a bucket contains objects that are uploaded by using multipart upload, the query results include only the complete objects that are combined by calling the CompleteMultipartUpload operation. Parts that belong to multipart upload tasks that have been initiated but neither completed nor canceled are not included in the query results.

Use the OSS console

  1. Log on to the OSS console.
  2. In the left-side navigation pane, click Buckets. On the Buckets page, click the name of the bucket that you want to manage.
  3. In the left-side navigation pane, choose Data Processing and Indexing > Data Indexing.
  4. In the Metadata Management section, enable metadata management.
    The time required for metadata management to take effect varies based on the number of objects in the bucket.
  5. Specify basic conditions to filter objects.
    In the Basic Filtering Conditions section, specify the basic filtering conditions based on your business requirements. The following table describes the basic filtering conditions.
    Filtering condition    Description
    Storage Class          By default, the following storage classes supported by OSS are selected: Standard, IA, Archive, and Cold Archive. You can also specify the storage class based on your business requirements.
    ACL                    By default, the following ACLs supported by OSS are selected: Inherited from Bucket, Private, Public Read, and Public Read/Write. You can also specify the ACL based on your business requirements.
    Object Name            You can select Fuzzy Match or Equal To. For example, to display an object named exampleobject.txt in the query results, use one of the following methods to match the object name:
                           • Select Equal To and enter the full name of the object. In this example, exampleobject.txt is entered.
                           • Select Fuzzy Match and enter the prefix or suffix of the object name. In this example, example or .txt is entered.
                             Notice: Fuzzy match matches the string that you enter at any position in the object name. For example, if you enter test next to Fuzzy Match, both localfolder/test/.example.jpg and localfolder/test.jpg meet the query condition and are displayed in the query results.
    Upload Type            By default, the following upload types supported by OSS are selected. You can also specify the upload type based on your business requirements.
                           • Normal: The object is uploaded by using simple upload.
                           • MultipartUpload: The object is uploaded by using multipart upload.
                           • Appendable: The object is uploaded by using append upload.
                           • Symlink: The object is a symbolic link that is created to access another object.
    Last Modified At       You can specify Start Date and End Date for Last Modified At. Start Date and End Date are accurate to the second.
    Object Size            You can select Equal To, Greater Than, Greater Than or Equal To, Less Than, or Less Than or Equal To. The object size is in KB.
    Object Versions        You can query only the current versions of objects.
  6. Optional: Specify other conditions to filter objects.
    If you want to sort objects in the query results or use tags to filter objects, click Show more filtering conditions.
    • Specify the order in which you want to sort objects in the query results.

      In the Object Sort Order section, select Ascending or Descending to sort the objects in the query results by Last Modified At, Object Name, or Object Size.

    • Specify tag-based filtering conditions.

      In the Tag-based Filtering Conditions section, specify the ETags or tags that you want to use to filter objects.

      • ETags support only exact match. You can enter multiple ETags. Separate multiple ETags with line feeds.
      • Specify Object Tags by using key-value pairs. The keys and values of object tags are case-sensitive. For more information about object tags, see Object tagging.
    • Specify the methods that you want to use to aggregate object data.

      If you want to categorize the query results and collect statistics on each category, you can specify data aggregation methods. For example, you can specify data aggregation methods to collect statistics on the sizes of all objects and obtain the number of distinct storage classes of objects in the query results.
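
The console steps above amount to building one structured query: a set of filtering conditions, a sort order, and optional aggregations. The following sketch shows one way such a request could be represented as a nested document. The field names (`Size`, `Filename`, `OSSStorageClass`), the operation names (`gt`, `match`, `and`, `sum`, `distinct`), and the overall layout are assumptions modeled loosely on the style of the OSS metadata query API and are not taken from this page, so verify them against the API reference before use. The helpers `cond`, `all_of`, and `build_request` are hypothetical, and the sketch assumes that 1 KB in the console means 1,024 bytes.

```python
import json

KB = 1024  # assumption: the console's "KB" means 1,024 bytes


def cond(field, operation, value):
    """One simple condition, for example Size > 102400."""
    return {"Field": field, "Operation": operation, "Value": str(value)}


def all_of(*subqueries):
    """Combine conditions so that every one of them must hold."""
    return {"Operation": "and", "SubQueries": list(subqueries)}


def build_request(min_size_kb, name_fragment):
    """Assemble filtering conditions, a sort order, and two aggregations."""
    return {
        "Query": all_of(
            # Object Size: Greater Than, converted from KB to bytes.
            cond("Size", "gt", min_size_kb * KB),
            # Object Name: Fuzzy Match on a fragment of the name.
            cond("Filename", "match", name_fragment),
        ),
        # Object Sort Order: sort by size in descending order.
        "Sort": "Size",
        "Order": "desc",
        # Aggregations: total size and number of distinct storage classes.
        "Aggregations": [
            {"Field": "Size", "Operation": "sum"},
            {"Field": "OSSStorageClass", "Operation": "distinct"},
        ],
    }


request = build_request(min_size_kb=100, name_fragment="test")
print(json.dumps(request, indent=2))
```

Keeping the conditions as small composable pieces makes it straightforward to add further filters, such as a last modified time range or a tag, without changing the rest of the request.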