This topic describes how to modify data shards, data sources, field configurations, and index schemas of a table.
Using data shards as an example, the following steps show how to modify a table and bring the changes online.
Log on to the OpenSearch console. Go to the details page of the instance that you want to manage and click Table Management in the left-side pane. Find the table that you want to modify and click Edit in the Actions column.
In the Basic Table Information step, adjust the number of data shards:
The number of data shards must be a positive integer and can be up to 256. We recommend that you specify a number that does not exceed three times the number of Searcher workers.
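The shard-count rule above can be checked before you open the console. The following is a minimal sketch, not part of OpenSearch: the function name `check_shard_count` and its parameters are hypothetical, and it encodes only the limits stated above (positive integer, at most 256, recommended not to exceed three times the number of Searcher workers).

```python
from typing import List

MAX_SHARDS = 256  # hard upper limit stated in this topic


def check_shard_count(shards: int, searcher_workers: int) -> List[str]:
    """Validate a planned shard count (hypothetical helper).

    Raises ValueError if a hard limit is violated.
    Returns a list of warnings; an empty list means the value also
    follows the recommendation.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(shards, bool) or not isinstance(shards, int) or shards < 1:
        raise ValueError("The number of data shards must be a positive integer.")
    if shards > MAX_SHARDS:
        raise ValueError(f"The number of data shards can be up to {MAX_SHARDS}.")

    warnings = []
    # Recommendation only: at most three times the number of Searcher workers.
    if shards > 3 * searcher_workers:
        warnings.append(
            f"{shards} shards exceeds three times the number of "
            f"Searcher workers ({searcher_workers})."
        )
    return warnings
```

For example, `check_shard_count(6, 2)` returns an empty list, while `check_shard_count(7, 2)` returns one warning because 7 is greater than 3 × 2.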
Check the data source information and click Next.
For API data sources, simply click Next. For MaxCompute and OSS data sources, the default setting is to retain the full data source information. If changes are necessary, select the option to modify, enter the new configuration details, then click Data Source Verification followed by Next.
Decide if the index schema requires changes. There are two scenarios:
No change required: Select Do Not Change and click Next.
Change required: Select Change and change the index schema. Table mode and developer mode are supported.
Decide if the table configurations require changes. There are two scenarios:
No change required: Select Do Not Change and click Next.
Change required: Select Change and change the data processing and dictionary configurations.
Data Processing Configuration: By default, each data source is allocated two free data processing resources. To handle large data updates within a short timeframe, you can modify the process_partition_count parameter to expand data processing capabilities.
After you modify the parameter, click Billing Documentation to view the additional resource usage.
Submit your edits to generate a new version:
Dictionary Configuration: Lets you customize tokenization. If the system tokenizer does not meet your query tokenization needs, you can configure a custom dictionary for the tokenizer to get the results you need.
Finalize the edits, complete the full data source details, and click Confirm:
For a MaxCompute data source, you can choose to re-import full data or use current index data.
For an OSS data source, you can choose to re-import full data or use current index data.
For an API data source, you can choose to use empty data, re-import full data, or use current index data.
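The options listed above for each data source type can be summarized as a lookup table. This is an illustrative sketch only: the keys and option identifiers (`maxcompute`, `empty_data`, and so on) are hypothetical labels for the console choices, not names from the OpenSearch API.

```python
from typing import Dict, Tuple

# Hypothetical identifiers for the console choices described above.
REBUILD_OPTIONS: Dict[str, Tuple[str, ...]] = {
    "maxcompute": ("reimport_full_data", "use_current_index_data"),
    "oss": ("reimport_full_data", "use_current_index_data"),
    "api": ("empty_data", "reimport_full_data", "use_current_index_data"),
}


def allowed_rebuild_options(source_type: str) -> Tuple[str, ...]:
    """Return the rebuild options available for a data source type."""
    key = source_type.strip().lower()
    if key not in REBUILD_OPTIONS:
        raise ValueError(f"Unknown data source type: {source_type!r}")
    return REBUILD_OPTIONS[key]
```

Only the API data source offers the empty data option, so `"empty_data" in allowed_rebuild_options("OSS")` evaluates to `False`.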
Important: The timestamp specifies how far back the new full version of the index can trace API incremental data, up to 3 days.
For MaxCompute data source tables, the system reindexes by pulling the configured partition data and incremental data based on the specified timestamp.
The empty data index rebuilding method clears previous data and starts tracing real-time data from the specified timestamp. Use this method with caution.
The Restore Data from Index method requires selecting a data restoration version. For details, see the referenced document.
The current index data method rebuilds the index from the selected data version based on the index schema. For API data source tables, this avoids data loss and removes the need to restore data before rebuilding.
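The 3-day limit on the timestamp noted above can be verified before you submit a rebuild. The sketch below assumes the limit means the timestamp must not be more than 3 days before the current time. That reading, the function name `validate_trace_back`, and the use of timezone-aware `datetime` values are assumptions for illustration, not documented OpenSearch behavior.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_TRACE_BACK = timedelta(days=3)  # "up to 3 days", as stated above


def validate_trace_back(timestamp: datetime,
                        now: Optional[datetime] = None) -> timedelta:
    """Return how far back `timestamp` reaches, or raise ValueError.

    Hypothetical helper: assumes the timestamp must fall within the
    last 3 days relative to `now` (defaults to the current UTC time).
    """
    if timestamp.tzinfo is None:
        raise ValueError("Use a timezone-aware timestamp.")
    if now is None:
        now = datetime.now(timezone.utc)

    age = now - timestamp
    if age < timedelta(0):
        raise ValueError("The timestamp is in the future.")
    if age > MAX_TRACE_BACK:
        raise ValueError("The timestamp is more than 3 days in the past.")
    return age
```

Passing `now` explicitly makes the check reproducible, for example when validating a timestamp prepared ahead of a scheduled change.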
Monitor the progress of the changes on the Change History page:
Upon completion of the table modifications and reindexing, two finite-state machines (FSMs) are created to deploy configurations and manually initiate full indexing. The changes will take effect after the FSM process for data source modifications concludes.