Update existing documents in a DashVector collection by using the Python SDK.
If the ID of a document to be updated does not exist, the update operation has no effect.
If you update only some fields, the rest of the fields are set to None by default.
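Because omitted fields are reset to None, one way to change a single field without losing the others is to merge the change into the document's current field values and send the full set back. The following is a minimal sketch of the merge step only. `merge_fields` is a hypothetical helper, not part of the SDK, and how you obtain the current field values is left as an assumption.

```python
from typing import Any, Dict, Optional


def merge_fields(current: Optional[Dict[str, Any]],
                 changes: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict: the current fields overlaid with the changed ones."""
    merged = dict(current or {})  # copy so the caller's dict is not mutated
    merged.update(changes)
    return merged


# Hypothetical current field values of a document (assumed, not read from a cluster)
current_fields = {'name': 'zhangsan', 'weight': 70.0, 'age': 30}

# Change only 'age' while carrying the other fields along
full_fields = merge_fields(current_fields, {'age': 31})
print(full_fields)
```

The merged dictionary can then be passed as the fields argument of a Doc in the update call, so that unchanged fields are resent rather than reset.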
Prerequisites
Before you begin, make sure that you have:
A DashVector cluster
An API key
The latest version of the DashVector SDK
API signature
Collection.update(
docs: Union[Doc, List[Doc], Tuple, List[Tuple]],
partition: Optional[str] = None,
async_req: bool = False
) -> DashVectorResponse
Examples
Replace YOUR_API_KEY and YOUR_CLUSTER_ENDPOINT with your actual credentials.
Create a collection named quickstart. For more information, see the "Example" section of Create a collection.
All examples share this client setup:
import dashvector
from dashvector import Doc
import numpy as np
client = dashvector.Client(
api_key='YOUR_API_KEY',
endpoint='YOUR_CLUSTER_ENDPOINT'
)
collection = client.get(name='quickstart')
Update a single document
The simplest way to update a document is with an (id, vector) tuple:
# Update by using a tuple (id, vector)
ret = collection.update(
('2', [0.1, 0.1, 0.1, 0.1])
)
Alternatively, use a Doc object for more explicit control:
# Update by using a Doc object
ret = collection.update(
Doc(
id='1',
vector=[0.1, 0.2, 0.3, 0.4]
)
)
# Verify the update succeeded
assert ret
Update a document with fields
Pass a fields dictionary to update metadata alongside the vector. Fields can be predefined in the collection schema or added dynamically as schema-free fields:
# Update vector and fields by using a Doc object
ret = collection.update(
Doc(
id='3',
vector=np.random.rand(4),
fields={
# Predefined fields (types must match the collection schema)
'name': 'zhangsan', 'weight': 70.0, 'age': 30,
# Schema-free fields (str, int, bool, or float)
'anykey1': 'str-value', 'anykey2': 1,
'anykey3': True, 'anykey4': 3.1415926
}
)
)
# Equivalent tuple format: (id, vector, fields)
ret = collection.update(
('4', np.random.rand(4), {'foo': 'bar'})
)
Update multiple documents in a batch
Pass a list of Doc objects or tuples to update multiple documents in a single call:
# Batch update by using Doc objects
ret = collection.update(
[
Doc(id=str(i+5), vector=np.random.rand(4)) for i in range(10)
]
)
# Batch update by using tuples (id, vector, fields)
ret = collection.update(
[
('15', [0.2, 0.7, 0.8, 1.3], {'age': 20}),
('16', [0.3, 0.6, 0.9, 1.2], {'age': 30}),
('17', [0.4, 0.5, 1.0, 1.1], {'age': 40})
]
)
# Verify the batch update succeeded
assert ret
Update documents asynchronously
Set async_req=True to run the update without blocking. Call .get() on the returned future to retrieve the result:
# Submit an asynchronous batch update
ret_future = collection.update(
[
Doc(id=str(i+18), vector=np.random.rand(4), fields={'name': 'foo' + str(i)}) for i in range(10)
],
async_req=True
)
# Block until the async update completes
ret = ret_future.get()
Update a document with a sparse vector
ret = collection.update(
Doc(
id='28',
vector=[0.1, 0.2, 0.3, 0.4],
sparse_vector={1: 0.4, 10000: 0.6, 222222: 0.8}
)
)
Request parameters
Parameter | Type | Default | Description
docs | Union[Doc, List[Doc], Tuple, List[Tuple]] | - | One or more documents to update.
partition | Optional[str] | None | Target partition name.
async_req | bool | False | Set to True to run the update asynchronously.
Notes on the docs parameter:
Tuple format: Elements must follow the order (id, vector) or (id, vector, fields). A tuple is equivalent to a Doc object.
Fields: Each field is a key-value pair where the key is a str and the value is str, int, bool, or float.
If the key is predefined during collection creation, the value type must match the predefined type.
If the key is not predefined, the value can be any of the supported types. For more information, see Schema-free.
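The type rule for fields can be checked on the client side before sending an update. The following is a minimal sketch, assuming only the constraint stated above (str keys; str, int, bool, or float values). `invalid_field_keys` is a hypothetical helper, not part of the SDK, and it does not check predefined fields against the collection schema.

```python
from typing import Any, Dict, List

# Value types that the notes above list as supported for fields
ALLOWED_VALUE_TYPES = (str, int, bool, float)


def invalid_field_keys(fields: Dict[Any, Any]) -> List[Any]:
    """Return the keys whose key or value type breaks the stated rule."""
    bad = []
    for key, value in fields.items():
        if not isinstance(key, str) or not isinstance(value, ALLOWED_VALUE_TYPES):
            bad.append(key)
    return bad


fields = {'name': 'zhangsan', 'age': 30, 'anykey3': True, 'tags': ['a', 'b']}
# 'tags' is reported because a list is not one of the supported value types
print(invalid_field_keys(fields))
```

An empty result means the dictionary satisfies the general type rule. A type mismatch against a predefined field would still have to be caught by the service.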
Response
Returns a DashVectorResponse object:
Parameter | Type | Description | Example
code | int | Status code. For more information, see Status codes. | 0
message | str | Result message. | success
request_id | str | Unique request ID. | 19215409-ea66-4db9-8764-26ce2eb5bb99
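A caller will usually want to stop on a failed update and keep the request ID for troubleshooting. The following is a minimal sketch, assuming the response exposes `code`, `message`, and `request_id` attributes corresponding to the three rows above and that 0 means success, as in the example column. `ensure_success` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in so the sketch runs without a cluster.

```python
from types import SimpleNamespace


def ensure_success(ret):
    """Return the response unchanged, or raise if its status code is not 0."""
    if ret.code != 0:
        raise RuntimeError(
            f"update failed: code={ret.code}, "
            f"message={ret.message}, request_id={ret.request_id}"
        )
    return ret


# Stand-in for a real response (assumed attribute names, sample values from the table)
fake_ok = SimpleNamespace(
    code=0,
    message='success',
    request_id='19215409-ea66-4db9-8764-26ce2eb5bb99',
)
ensure_success(fake_ok)
```

With a real client, the value returned by collection.update(...) would take the place of the stand-in object.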