Object Storage Service (OSS) provides the data indexing feature to allow you to index the metadata of objects. You can specify the metadata of objects as index conditions to query objects. Data indexing helps you better understand and manage data structures and facilitates queries, statistics, and management of objects.

Scenarios

To meet data audit or data supervision requirements, you may need to find specific objects in an OSS bucket that stores hundreds of millions of objects. Each object carries many metadata attributes, including its name, ETag value, storage class, size, tags, and last modified time. The data indexing feature allows you to combine simple query conditions and data aggregation methods based on your business requirements to improve query performance.

Usage notes

  • Supported regions

    The data indexing feature is supported only in the China (Hangzhou) and Australia (Sydney) regions.

  • Billing

    During the public preview, you are not charged for metadata management. For more information about billable items of the data indexing feature, see Data indexing fees.

  • Time required for indexing

    When you enable metadata management, OSS creates an index. The time required to create the index is proportional to the number of objects stored in the bucket: the more objects the bucket contains, the longer index creation takes.

  • Multipart upload

    If a bucket contains objects that are uploaded by using multipart upload, the query results include only complete objects that are assembled by calling the CompleteMultipartUpload operation. Parts that belong to multipart upload tasks that have been initiated but neither completed nor canceled are not included in the query results.

Use the OSS console

  1. Log on to the OSS console.
  2. In the left-side navigation pane, click Buckets. On the Buckets page, click the name of the bucket that you want to manage.
  3. In the left-side navigation pane, choose Data Processing and Indexing > Data Indexing.
  4. In the Metadata Management section, enable metadata management.
    The time required for metadata management to take effect varies based on the number of objects in the bucket.
  5. Specify basic conditions to filter objects.
    In the Basic Filtering Conditions section, specify the basic filtering conditions based on your business requirements. The following list describes the basic filtering conditions.
    • Storage Class: By default, all of the following OSS storage classes are selected: Standard, IA, Archive, and Cold Archive. You can also select specific storage classes based on your business requirements.
    • ACL: By default, all of the following ACLs supported by OSS are selected: Inherited from Bucket, Private, Public Read, and Public Read/Write. You can also select specific ACLs based on your business requirements.
    • Object Name: You can select Fuzzy Match or Equal To when you specify an object name as a filtering condition. For example, to return an object named exampleobject.txt in the query results, use one of the following methods to match the object name:
      • Select Equal To and enter the full name of the object. In this example, enter exampleobject.txt.
      • Select Fuzzy Match and enter the prefix or suffix of the object name. In this example, enter example or .txt.
        Notice: Fuzzy match returns every object whose name contains the specified characters. For example, if you enter test next to Fuzzy Match, both localfolder/test/.example.jpg and localfolder/test.jpg meet the query condition and are displayed in the query results.
    • Upload Type: By default, all of the following upload types supported by OSS are selected. You can also select specific upload types based on your business requirements.
      • Normal: returns objects uploaded by using simple upload.
      • Multipart: returns objects uploaded by using multipart upload.
      • Appendable: returns objects uploaded by using append upload.
      • Symlink: returns symbolic links.
    • Last Modified At: You can specify Start Date and End Date for Last Modified At. Start Date and End Date are accurate to the second.
    • Object Size: You can select Equal To, Greater Than, Greater Than or Equal To, Less Than, or Less Than or Equal To. The object size is in KB.
    • Object Versions: You can query only the current versions of objects.
  6. Optional: Specify other conditions to filter objects.
    If you want to sort objects in the query results or use tags to filter objects, click Show more filtering conditions.
    • Specify the order in which you want to sort objects in the query results

      In the Object Sort Order section, select Ascending or Descending to sort the objects by Last Modified At, Object Name, or Object Size.

    • Specify tag-based filtering conditions

      In the Tag-based Filtering Conditions section, specify the ETags or tags that you want to use to filter objects.

      • ETags support only exact match. You can enter multiple ETags. Separate multiple ETags with line feeds.
      • Specify Object Tags by using key-value pairs. The keys and values of object tags are case-sensitive. For more information about object tags, see Object tagging.
    • Specify the methods that you want to use to aggregate object data

      If you want to categorize the query results and collect statistics on each category, you can specify data aggregation methods. For example, you can specify data aggregation methods to collect statistics on the sizes of all objects and obtain the number of distinct storage classes of objects in the query results.
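
To make the filtering, sorting, and aggregation steps above concrete, the following minimal Python sketch models them on a small in-memory list. It does not call OSS or any OSS SDK. The ObjectMeta record, its field names, and the sample sizes and storage classes are hypothetical and exist only for illustration. The object names are taken from the Fuzzy Match example in step 5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectMeta:
    # Hypothetical record type; these field names are illustrative,
    # not names used by OSS.
    name: str
    size_kb: int
    storage_class: str


def fuzzy_match(name: str, text: str) -> bool:
    # Models Fuzzy Match as described in step 5: the condition holds
    # when the object name contains the entered characters anywhere.
    return text in name


# Sample metadata (sizes and storage classes are made up).
objects = [
    ObjectMeta("localfolder/test/.example.jpg", 120, "Standard"),
    ObjectMeta("localfolder/test.jpg", 40, "IA"),
    ObjectMeta("exampleobject.txt", 8, "Standard"),
]

# Basic filtering: Fuzzy Match on "test" combined with
# Object Size Greater Than or Equal To 40 KB.
matched = [o for o in objects if fuzzy_match(o.name, "test") and o.size_kb >= 40]

# Object Sort Order: sort by Object Size, Descending.
matched.sort(key=lambda o: o.size_kb, reverse=True)

# Aggregation: total size of the matched objects and the number of
# distinct storage classes among them.
total_size_kb = sum(o.size_kb for o in matched)
distinct_classes = len({o.storage_class for o in matched})

print([o.name for o in matched])
print(total_size_kb, distinct_classes)
```

In this sketch, exampleobject.txt is excluded because its name does not contain test, which mirrors how Fuzzy Match differs from Equal To in the console.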