Detailed explanation of MongoDB and its interaction with Python


This article covers MongoDB: worked examples, advanced queries, the aggregation pipeline, and interaction from Python.

Advantages of MongoDB
Easy to scale

Handles large volumes of data with high performance

Flexible data model

Install and start

View help: mongod --help
Start the service: sudo service mongod start
Stop the service: sudo service mongod stop
Restart the service: sudo service mongod restart
View process: ps ajx|grep mongod

MongoDB database operations
Database operations
View the current database: db
View all databases: show dbs or show databases
Switch database: use db_name
Drop the current database: db.dropDatabase()
Collection operations

A collection is created automatically the first time data is inserted into a collection that does not yet exist.
You can also create a collection manually: db.createCollection(name, options), where options is optional.
The options include:

Parameter capped: defaults to false, meaning the collection has no upper limit; true turns it into a capped collection with an upper limit.

Parameter size: required when capped is true. It is the maximum size of the collection in bytes; once the collection reaches this limit, the oldest documents are overwritten.
When the collection exists:
View collections: show collections
Drop a collection: db.collectionname.drop()
MongoDB data types
ObjectId: the document ID

String: the most commonly used type; must be valid UTF-8

Boolean: stores a boolean value, true or false

Integer: an integer, 32-bit or 64-bit depending on the server

Double: stores a floating-point value

Array: an array or list, storing multiple values under one key

Object: an embedded document, i.e. a value that is itself a document

Null: stores a null value

Timestamp: a timestamp, the number of seconds elapsed since 1970-01-01

Date: stores a date or time in UNIX time format

Note:

A date is created with the following statement; the argument has the format YYYY-MM-DD:
new Date('2017-12-20')
Every document has an _id attribute that guarantees its uniqueness.
You can set _id yourself when inserting a document. If you do not provide one, MongoDB generates a unique _id of type ObjectId for the document.

An ObjectId is a 12-byte value, usually displayed as 24 hexadecimal characters:
The first 4 bytes are the creation timestamp
The next 3 bytes are the machine ID
The next 2 bytes are the ID of the MongoDB service process
The last 3 bytes are a simple incrementing counter
(This is the original layout; newer MongoDB versions replace the 5 bytes of machine ID and process ID with a random value.)
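The timestamp part of an ObjectId can be read back with nothing but the standard library. The following is a minimal sketch, not MongoDB's own code: the helper name objectid_timestamp is made up for illustration, and the sample value is the ObjectId that appears in the update examples of this article.

```python
from datetime import datetime, timezone


def objectid_timestamp(hex_id: str) -> datetime:
    """Return the creation time stored in the first 4 bytes of an ObjectId."""
    if len(hex_id) != 24:
        raise ValueError("an ObjectId is written as 24 hexadecimal characters")
    raw = bytes.fromhex(hex_id)               # the 12 raw bytes
    seconds = int.from_bytes(raw[:4], "big")  # big-endian seconds since 1970-01-01
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


created = objectid_timestamp("5b66f05f1194e110103bc283")
print(created.isoformat())
```

When pymongo is installed, the generation_time attribute of bson.ObjectId exposes the same information.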
MongoDB data manipulation
Insert data (an error is reported if a document with the same _id already exists): db.collection_name.insert(document)

Insert data (the existing document is updated if the _id already exists): db.collection

For example:

# When a document is inserted without an _id, MongoDB assigns it a unique ObjectId

# An _id can also be specified explicitly when inserting a document

# Update the document inserted above by passing {_id:"10001",name:"xianyuplus",age:"40"}
Query data: db.collection_name.find()

Update data: db.collection_name.update(<query>, <update>, {multi: <boolean>})
Parameter query: query condition

Parameter update: update operator

Parameter multi: optional, defaults to false, which means only the first matching document is updated; true means all documents that meet the condition are updated
For example:

Original content:

"_id" : ObjectId("5b66f05f1194e110103bc283"),
"name": "xianyuplus",
"age": "40"

# Replace the document whose name is xianyuplus with a document whose name is xianyuplus1
Contents after the operation:

"_id" : ObjectId("5b66f05f1194e110103bc283"),
"name": "xianyuplus1"
As you can see, calling update with a plain document replaces the whole original document, so the age field is lost. To change only specific fields, use update together with $set and name the keys to update.

For example:

Original content:
"_id" : ObjectId("5b66f05f1194e110103bc283"),
"name": "xianyuplus",
"age": "40"
# Use $set to change name from xianyuplus to xianyuplus1

Contents after operation:
"_id" : ObjectId("5b66f05f1194e110103bc283"),
"name": "xianyuplus1",
"age": "40"
Update multiple documents: use the parameter multi: true
For example:

# Update the name value of all documents to xianyuplus1
Note: multi only takes effect when the update uses $ operators such as $set.
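The difference between a replacement-style update and an update with $set can be modelled in a few lines of plain Python. This is only a sketch of the behaviour described above, not MongoDB code: apply_update is a hypothetical helper, and it models a single matched document and only the $set operator.

```python
import copy


def apply_update(doc: dict, update: dict) -> dict:
    """Model how one matched document changes under a shell-style update."""
    if any(key.startswith("$") for key in update):
        # Operator form: only $set is modelled in this sketch.
        result = copy.deepcopy(doc)
        result.update(update.get("$set", {}))
        return result
    # Replacement form: every field except _id is replaced.
    return {"_id": doc["_id"], **copy.deepcopy(update)}


original = {"_id": "5b66f05f1194e110103bc283", "name": "xianyuplus", "age": "40"}

replaced = apply_update(original, {"name": "xianyuplus1"})
patched = apply_update(original, {"$set": {"name": "xianyuplus1"}})

print(replaced)  # the age field is gone
print(patched)   # the age field is kept
```

In pymongo the two behaviours are split across separate methods: replace_one performs a replacement, while update_one and update_many expect $ operators and reject a plain document.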
Delete data: db.collection_name.remove(<query>, {justOne: <boolean>})

Parameter query: optional, the condition that documents must match to be deleted

Parameter justOne: optional; if set to true or 1, only one document is deleted. The default is false, which deletes all matching documents
For example:

# Delete all documents whose name is xianyuplus
MongoDB advanced queries
MongoDB query methods
Query documents: db.collection_name.find({condition document})
Query a single document: db.collection_name.findOne({condition document})
Pretty-print the results: db.collection_name.find({condition document}).pretty()
For example:

# Query the documents whose name is xianyuplus

# Query a single document whose name is xianyuplus
MongoDB comparison operators
Equal to: as in the example above
Greater than: $gt (greater than)
Greater than or equal to: $gte (greater than or equal)
Less than: $lt (less than)
Less than or equal to: $lte (less than or equal)
Not equal to: $ne (not equal)
For example:

# Query data with age greater than 20

# Query data whose age is greater than or equal to 20

# Query data whose age is less than 20

# Query data whose age is less than or equal to 20

# Query data whose age is not equal to 20
MongoDB logical operators
and: write multiple field conditions in the find condition document
or: use $or
For example:

#Find data whose name is xianyuplus and whose age is 20

#Find data whose name is xianyuplus or whose age is 20

#Find data whose name is xianyuplus or whose age is greater than 20

#Find data whose name is xianyuplus and which also has either an age greater than or equal to 20 or a gender of male
MongoDB range operators
Use $in and $nin to test whether a value is, or is not, in a given list of values
For example:

#Query data whose age is 18 or 28
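To make the comparison, logical and range operators concrete, here is a small in-memory matcher written in plain Python. It is a simplified model and not how MongoDB evaluates queries: the matches and find helpers and the sample documents are invented for illustration. The filter dictionaries, however, have the same shape that pymongo's find accepts.

```python
import operator

# Comparison and range operators modelled by this sketch.
OPS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$ne": operator.ne,
    "$in": lambda value, choices: value in choices,
    "$nin": lambda value, choices: value not in choices,
}


def matches(doc: dict, query: dict) -> bool:
    """Return True if doc satisfies query (a small subset of find semantics)."""
    for key, cond in query.items():
        if key == "$or":
            # $or takes a list of sub-queries; at least one must match.
            if not any(matches(doc, sub) for sub in cond):
                return False
        elif isinstance(cond, dict):
            # Operator document such as {"$gt": 20}.
            # Simplification: a missing field never matches here, whereas
            # real MongoDB lets $ne and $nin match documents without the field.
            if key not in doc:
                return False
            if not all(OPS[op](doc[key], arg) for op, arg in cond.items()):
                return False
        elif doc.get(key) != cond:
            # A plain value means equality; several keys combine as "and".
            return False
    return True


people = [
    {"name": "xianyuplus", "age": 20, "gender": "male"},
    {"name": "xianyuplus1", "age": 18, "gender": "female"},
    {"name": "xianyuplus2", "age": 28, "gender": "male"},
    {"name": "xianyuplus3", "age": 35, "gender": "female"},
]


def find(query: dict) -> list:
    """Return the names of the sample documents that match the query."""
    return [doc["name"] for doc in people if matches(doc, query)]


older = find({"age": {"$gt": 20}})
either = find({"$or": [{"name": "xianyuplus"}, {"age": {"$gt": 20}}]})
combined = find({"$or": [{"age": {"$gte": 20}}, {"gender": "male"}], "name": "xianyuplus"})
in_range = find({"age": {"$in": [18, 28]}})
print(older, either, combined, in_range)
```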
Using regular expressions in MongoDB
Write regular expressions using // or $regex
For example:

# Query data whose name starts with xian
MongoDB paging and skipping
Query the first n documents: db.collection_name.find().limit(NUMBER)
Skip the first n documents: db.collection_name.find().skip(NUMBER)

For example:

#Query the first 3 documents

#Query the documents after the first 3

#skip and limit can be used together, here to query the 4th, 5th and 6th documents
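Paging usually comes down to turning a page number into a skip value and a limit value. The sketch below shows that arithmetic in plain Python; page_window is a hypothetical helper, and list slicing stands in for what skip and limit do on a real result set.

```python
from typing import Tuple


def page_window(page: int, page_size: int) -> Tuple[int, int]:
    """Return the (skip, limit) pair for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return (page - 1) * page_size, page_size


documents = [{"n": i} for i in range(1, 11)]  # stand-in for an ordered result set

skip, limit = page_window(page=2, page_size=3)
second_page = documents[skip:skip + limit]    # slicing plays the role of skip and limit
print([doc["n"] for doc in second_page])
```

With pymongo, the same pair would be passed along as collection.find().skip(skip).limit(limit).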
MongoDB custom queries
Use $where to write a custom query as a JavaScript function

For example:

//Query data with age greater than 30
db.xianyu.find({
    $where: function() {
        return this.age > 30;
    }
})
MongoDB projection
Projection: show only the fields you want to see in the query results.

db.collection_name.find({}, {field_name: 1, ...})
Set a field to 1 to display it; fields that are not listed are not displayed. The _id field is special: it is displayed by default, so set _id to 0 to hide it.

#Only the name field is displayed in the query result, and age is not displayed
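The projection rules can be modelled on a plain dictionary. This is a sketch of inclusion projections only, with a hypothetical project helper; it does not cover exclusion projections or nested fields.

```python
def project(doc: dict, projection: dict) -> dict:
    """Apply an inclusion projection such as {"name": 1, "_id": 0}."""
    wanted = {key for key, flag in projection.items() if flag and key != "_id"}
    result = {key: value for key, value in doc.items() if key in wanted}
    # _id is kept unless the projection explicitly sets it to 0.
    if projection.get("_id", 1) and "_id" in doc:
        result = {"_id": doc["_id"], **result}
    return result


person = {"_id": 1, "name": "xianyuplus", "age": 20}

with_id = project(person, {"name": 1})
without_id = project(person, {"name": 1, "_id": 0})
print(with_id, without_id)
```

In pymongo the projection document is the second argument of find, for example find({}, {"name": 1, "_id": 0}).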
MongoDB sorting
Sort: db.collection_name.find().sort({field: 1, ...})
Set the value for each field to sort by: 1 for ascending order, -1 for descending order

For example:

#Sort by gender in descending order first, then by age in ascending order
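A multi-key sort like the one above can be reproduced in plain Python by sorting on the least significant key first and relying on the stability of Python's sort. The sort_documents helper and the sample data are illustrative assumptions; the spec format, a list of (field, direction) pairs, matches what pymongo's cursor sort method accepts.

```python
def sort_documents(docs: list, spec: list) -> list:
    """Sort docs by a list of (field, direction) pairs, 1 ascending, -1 descending."""
    ordered = list(docs)
    # Apply the keys from last to first; each stable sort keeps the order
    # produced by the less significant keys within equal values.
    for field, direction in reversed(spec):
        ordered.sort(key=lambda doc: doc[field], reverse=(direction == -1))
    return ordered


people = [
    {"name": "a", "gender": "female", "age": 30},
    {"name": "b", "gender": "male", "age": 25},
    {"name": "c", "gender": "female", "age": 18},
    {"name": "d", "gender": "male", "age": 20},
]

ordered = sort_documents(people, [("gender", -1), ("age", 1)])
names = [doc["name"] for doc in ordered]
print(names)
```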
MongoDB count
Count documents: db.collection_name.find({condition}).count()
db.collection_name.count({condition})
For example:

#Count the documents whose age is 20
#Count the documents whose age is greater than 20 and whose gender is "nan" (male)
MongoDB deduplication
Deduplicate: db.collection_name.distinct('field to deduplicate', {condition})
For example:

#Get the distinct hometown values of documents whose age is greater than 18
MongoDB pipeline and aggregation
Aggregation (aggregate) is a data-processing pipeline. Each document passes through a pipeline made up of multiple stages; each stage can group, filter, or otherwise transform the documents, and the last stage outputs the result.

Usage: db.collection_name.aggregate({pipeline stage: {expression}})

Commonly used pipeline stages:

$group: Group the documents in the collection, which can be used for statistical results
$match: Filter data, only output documents that meet the conditions
$project: Modify the structure of the output document, such as renaming, adding, deleting fields, creating calculation results
$sort: Sort the input documents and output them
$limit: Limit the number of documents returned by the aggregation pipeline
$skip: Skip the specified number of documents and return the rest
$unwind: Split an array-type field into separate documents
Common expressions, written as expression: "$column name"

$sum: Calculate the sum; $sum:1 adds 1 for each document, i.e. counts the documents
$avg: Calculate the average
$min: Get the minimum value
$max: Get the maximum value
$push: Insert values into an array in the resulting document
$first: Get the first document according to the order of the source documents
$last: Get the last document according to the order of the source documents
Aggregation: $group
group: groups documents so that they can be counted
Usage: _id specifies what to group by, _id: "$field name"

For example:

#Group by hometown and count
db.xianyu.aggregate({$group:{_id:"$hometown", count:{$sum:1}}})

# Put all documents in the collection into a single group and count them
db.xianyu.aggregate({$group:{_id:null, count:{$sum:1}}})
Aggregation: $project
project: modifies the structure of the documents passing through, for example by renaming, adding, or deleting fields
For example:

#Group by hometown and count
#In the output, display only the count field
db.xianyu.aggregate([
    {$group:{_id:"$hometown", count:{$sum:1}}},
    {$project:{_id:0, count:1}}
])
Aggregation: $match
match: filters data and only outputs documents that meet the conditions. It works like find, but match is a pipeline stage, so it can pass its result to the next stage, which find cannot do.

For example:

#Keep documents whose age is greater than 20
#Group by hometown and count
#In the output, display only the count field
db.xianyu.aggregate([
    {$match:{age:{$gt:20}}},
    {$group:{_id:"$hometown", count:{$sum:1}}},
    {$project:{_id:0, count:1}}
])
Aggregation: $sort
sort: sorts the input documents and outputs them
For example:

#Keep documents whose age is greater than 20
#Group by hometown and count
#In the output, display only the count field
#Sort by count in ascending order
db.xianyu.aggregate([
    {$match:{age:{$gt:20}}},
    {$group:{_id:"$hometown", count:{$sum:1}}},
    {$project:{_id:0, count:1}},
    {$sort:{count:1}}
])
Aggregation: $limit and $skip
limit: limits the number of documents returned by the aggregation pipeline

skip: skips the specified number of documents and returns the remaining documents

For example:

#Keep documents whose age is greater than 20
#Group by hometown and count
#Sort by count in ascending order
#Skip the first document and return from the second onward
db.xianyu.aggregate([
    {$match:{age:{$gt:20}}},
    {$group:{_id:"$hometown", count:{$sum:1}}},
    {$sort:{count:1}},
    {$skip:1}
])
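To see what each stage contributes, the chain of match, group, sort and skip can be traced step by step in plain Python. This is a sketch under stated assumptions: the sample documents and hometown values are invented, and the Python statements only imitate the stages rather than call MongoDB. The pipeline list shows how the same stages would be written for pymongo's aggregate method.

```python
from collections import Counter

# Hypothetical sample data standing in for the xianyu collection.
people = [
    {"name": "a", "age": 25, "hometown": "beijing"},
    {"name": "b", "age": 30, "hometown": "shanghai"},
    {"name": "c", "age": 22, "hometown": "beijing"},
    {"name": "d", "age": 18, "hometown": "beijing"},
    {"name": "e", "age": 41, "hometown": "shenzhen"},
    {"name": "f", "age": 33, "hometown": "shanghai"},
    {"name": "g", "age": 27, "hometown": "beijing"},
]

# The stages as a pymongo-style pipeline: a list of one-key dictionaries.
pipeline = [
    {"$match": {"age": {"$gt": 20}}},
    {"$group": {"_id": "$hometown", "count": {"$sum": 1}}},
    {"$sort": {"count": 1}},
    {"$skip": 1},
]

# Stage-by-stage imitation in plain Python.
matched = [p for p in people if p["age"] > 20]            # $match
counts = Counter(p["hometown"] for p in matched)          # $group with $sum: 1
grouped = [{"_id": town, "count": n} for town, n in counts.items()]
ordered = sorted(grouped, key=lambda g: g["count"])       # $sort, ascending
result = ordered[1:]                                      # $skip: 1
print(result)
```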
Aggregation: $unwind
unwind: splits an array-type field of a document into multiple documents, each containing one value from the array

db.collection_name.aggregate({$unwind:'$field name'})

For example, unwinding the size field of a document whose size array holds "S", "M" and "L" produces:

{ "_id" : 1, "item" : "t-shirt", "size" : "S" }
{ "_id" : 1, "item" : "t-shirt", "size" : "M" }
{ "_id" : 1, "item" : "t-shirt", "size" : "L" }
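The effect of unwinding can be imitated with a short generator in plain Python. This is a sketch, not MongoDB's implementation: unwind is a hypothetical helper, the input document is inferred from the three output documents above, and the field is assumed to be either a list or absent.

```python
def unwind(docs, field):
    """Yield one copy of each document per element of its array field."""
    for doc in docs:
        # Documents without the field produce no output in this sketch.
        for value in doc.get(field, []):
            yield {**doc, field: value}


source = [{"_id": 1, "item": "t-shirt", "size": ["S", "M", "L"]}]
rows = list(unwind(source, "size"))
for row in rows:
    print(row)
```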
Notes on using aggregation
The result has as many keys as the dictionary passed to $group

The value to group by must be given as the value of _id

To read the value of a field, prefix its name with $, for example $gender or $age

To read a value from a dictionary nested inside another dictionary, use $ with dot notation ($outer.inner)

You can group by several keys at the same time

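MongoDB indexes
Usage: db.collection.ensureIndex({property:1}), where 1 means ascending order and -1 means descending order. Note that ensureIndex is deprecated in newer MongoDB versions in favour of createIndex, which takes the same arguments.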

Create a unique index: db.collection.ensureIndex({"property":1},{"unique":true})
Create a unique index and eliminate:

Create a compound index: db.collection.ensureIndex({property:1,age:1})
View all indexes of the current collection: db.collection.getIndexes()
Drop index: db.collection.dropIndex('index name')

MongoDB data backup and recovery
MongoDB data backup
Backup: mongodump -h dbhost -d dbname -o dbdirectory
-h: server address, you can also specify the port number
-d: The name of the database to be backed up
-o: The backup data storage location, this directory stores the backed up data
mongodb data recovery
Restore: mongorestore -h dbhost -d dbname --dir dbdirectory
-h: server address
-d: The database instance that needs to be restored
--dir: The location of the backup data
MongoDB interaction with Python
Install and import
Install: pip install pymongo
Import the module: from pymongo import MongoClient

Instantiate a client object to connect to the database. The connection takes two parameters, host and port.

from pymongo import MongoClient

class clientMongo:
    def __init__(self):
        # host is assumed to be a local server; the source leaves it blank
        client = MongoClient(host="localhost", port=27017)
        # Use [] brackets to select the database and the collection
        self.cliention = client["xianyu"]["xianyuplus"]
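Insert data
Insert a single document: returns the ObjectId of the inserted document

    def item_insert_one(self):
        # insert_one replaces insert, which newer pymongo versions no longer provide
        ret = self.cliention.insert_one({"xianyu":"xianyuplus","age":20})
        return ret.inserted_id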
Insert multiple documents:

    def item_insert_many(self):
        item_list = [{"name":"xianyuplus{}".format(i)} for i in range(10000)]
        items = self.cliention.insert_many(item_list)
        return items.inserted_ids
Query data
Query a single document:

    def item_find_one(self):
        ret = self.cliention.find_one({"xianyu":"xianyuplus"})
        return ret
Query multiple documents:

    def item_find_many(self):
        # find returns a cursor that yields the matching documents
        ret = self.cliention.find({"xianyu":"xianyuplus"})
        for i in ret:
            print(i)
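Update data
Update a single document (the method body is missing from the source; the call shown is an illustrative reconstruction):

    def item_update_one(self):
        self.cliention.update_one({"xianyu":"xianyuplus"}, {"$set":{"xianyu":"xianyuplus1"}})  # illustrative filter and value
Update all matching documents (again an illustrative reconstruction):

    def item_update(self):
        self.cliention.update_many({"xianyu":"xianyuplus"}, {"$set":{"xianyu":"xianyuplus1"}})  # illustrative filter and value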
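Delete data
Delete a single document (the method body is missing from the source; the call shown is an illustrative reconstruction):

    def item_delete_one(self):
        self.cliention.delete_one({"xianyu":"xianyuplus"})  # illustrative filter
Delete all matching documents (again an illustrative reconstruction):

    def item_delete_many(self):
        self.cliention.delete_many({"xianyu":"xianyuplus"})  # illustrative filter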

Closing words
The above covers common MongoDB usage. The key parts are the advanced queries and the aggregation pipeline, which are worth reviewing several times until they stick. This is the last article in the series on Python database interaction; I hope it helps.
