Unlike traditional relational databases, MongoDB stores JSON-like documents, which makes it a good fit for the JSON data returned by web crawlers. A previous post covered installing MongoDB on Windows; this one covers connecting to MongoDB from Python and inserting, querying, updating, and deleting data.

Before connecting to MongoDB, you need to install the PyMongo package. Installation is a single command: pip install pymongo.

Creating connections

With PyMongo installed, connecting to MongoDB from Python is straightforward:

from pymongo import MongoClient

client = MongoClient('localhost', 27017)

Alternatively, pass a connection URI:

from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017')

If the connection requires a username, a password, or other options, refer to the documentation for pymongo.mongo_client.MongoClient.
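As a minimal sketch of the credentials case, the helper below builds a connection URI with the username and password percent-encoded, which is needed when they contain characters such as '@' or '/'. The helper name build_mongo_uri and the credentials are hypothetical and used only for illustration.

```python
from urllib.parse import quote_plus


def build_mongo_uri(username, password, host="localhost", port=27017):
    """Return a MongoDB URI with percent-encoded credentials."""
    # Characters such as '@', ':' and '/' in credentials must be escaped,
    # otherwise the URI cannot be parsed unambiguously.
    user = quote_plus(username)
    pwd = quote_plus(password)
    return f"mongodb://{user}:{pwd}@{host}:{port}"


# Hypothetical credentials, for illustration only.
uri = build_mongo_uri("crawler", "p@ss/word")
print(uri)  # mongodb://crawler:p%40ss%2Fword@localhost:27017
```

The resulting string can be passed to MongoClient(uri) in place of the plain 'mongodb://localhost:27017' URI.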

Connecting to the database

Selecting a database is just as simple. Most importantly, you don't need to create the database beforehand: if it exists, you get it directly; if it doesn't, MongoDB creates it the first time data is written to it. For example:

# Option 1: attribute access
db = client.pymongo_test
# Option 2: dictionary-style access
db = client['pymongo_test']

The two forms are equivalent; use whichever you prefer.

Collection concept

MongoDB organizes data into collections. A collection is a group of documents inside a database and is roughly equivalent to a table in a relational database, with each document playing the role of a row. To get a collection:

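# Option 1: attribute access
collection = db.test_collection
# Option 2: dictionary-style access (required for names such as
# 'test-collection' that are not valid Python identifiers)
collection = db['test-collection']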

Note that collections are created lazily: a collection does not actually exist on the server until the first document is inserted into it.
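One way to observe this behavior is to ask the server which collections it actually holds. The sketch below assumes a Database object like the db used above; the helper name collection_exists is hypothetical.

```python
def collection_exists(db, name):
    """Return True if a collection called `name` already exists in `db`."""
    # list_collection_names() asks the server for the collections it
    # currently holds. Merely referencing db[name] in Python does not
    # create anything on the server.
    return name in db.list_collection_names()
```

Against a running server, collection_exists(db, 'test_collection') should return False right after db.test_collection is referenced, and True once a document has been inserted into it.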

Inserting data

Inserting data is simple. The most commonly used methods are insert_one() and insert_many(); as the names suggest, the first inserts a single document and the second inserts several at once. Example:

from pymongo import MongoClient

client = MongoClient('localhost', 27017)
db = client.testdb
posts = db.posts

post_1 = {
    'title': 'Python and MongoDB',
    'content': 'PyMongo is fun, you guys',
    'author': 'Scott'
}
post_2 = {
    'title': 'Virtual Environments',
    'content': 'Use virtual environments, you guys',
    'author': 'Scott'
}
post_3 = {
    'title': 'Learning Python',
    'content': 'Learn Python, it is easy',
    'author': 'Bill'
}


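# Insert a single document
posts.insert_one(post_1)

# Insert several documents in one call. post_1 is left out here because
# insert_one() added an _id to it; inserting it again would raise a
# duplicate key error.
posts.insert_many([post_2, post_3])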

Querying data

As with inserting, there are separate methods for fetching one document or several: find_one() and find(), respectively. Example:

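# Fetch a single document (returns None if nothing matches)
bills_post = posts.find_one({'author': 'Bill'})
print(bills_post)

# Fetch multiple documents (returns a cursor to iterate over)
scotts_posts = posts.find({'author': 'Scott'})
for post in scotts_posts:
    print(post)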

When querying multiple documents, you can also limit the number returned or set other options; see the documentation for pymongo.collection.Collection.find.
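As a small sketch of those options, find() accepts sort and limit keyword arguments. The helper name first_posts_by_author and the choice to sort by title are assumptions made for illustration.

```python
def first_posts_by_author(collection, author, limit=2):
    """Return up to `limit` posts by `author`, ordered by title."""
    # sort takes a list of (field, direction) pairs, where 1 means
    # ascending and -1 descending; limit caps the number of results.
    cursor = collection.find(
        {"author": author},
        sort=[("title", 1)],
        limit=limit,
    )
    return list(cursor)
```

With the posts collection from the earlier example, first_posts_by_author(posts, 'Scott', limit=1) should return a list containing at most one of Scott's posts.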

To express conditions like those in a relational WHERE clause, use MongoDB's query operators, such as $lt (less than). Example:

import datetime
import pprint

d = datetime.datetime(2009, 11, 12, 12)
for post in posts.find({"date": {"$lt": d}}).sort("author"):
    pprint.pprint(post)

Deleting data

Deleting data is also simple. The main methods are delete_one(), which removes the first document matching a filter, and delete_many(), which removes every matching document.
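A minimal sketch of both calls follows. The helper names are hypothetical; each takes a collection such as the posts collection used earlier and returns the number of documents removed, read from the result's deleted_count attribute.

```python
def delete_first_post_by(collection, author):
    """Delete at most one post by `author`; return how many were deleted."""
    result = collection.delete_one({"author": author})
    return result.deleted_count


def delete_all_posts_by(collection, author):
    """Delete every post by `author`; return how many were deleted."""
    result = collection.delete_many({"author": author})
    return result.deleted_count
```

Be careful with the filter: passing an empty filter {} to delete_many() removes every document in the collection.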

Updating data

The main methods for updating data are update_one() and update_many(). There is also replace_one(), which replaces an entire document; it is used less often, so see the documentation for details.
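The sketch below shows one possible use of each method. The helper names and the fields being changed are assumptions chosen to match the posts used earlier. Note that update_one() and update_many() expect an update document built from operators such as $set, whereas replace_one() takes the full replacement document.

```python
def retitle_post(collection, old_title, new_title):
    """Change the title of the first matching post."""
    # $set modifies only the listed fields and leaves the rest intact.
    result = collection.update_one(
        {"title": old_title},
        {"$set": {"title": new_title}},
    )
    return result.modified_count


def rename_author(collection, old_name, new_name):
    """Change the author on every matching post."""
    result = collection.update_many(
        {"author": old_name},
        {"$set": {"author": new_name}},
    )
    return result.modified_count


def replace_post(collection, title, new_document):
    """Swap the first post with this title for `new_document`."""
    # Every field except _id is replaced by the contents of new_document.
    result = collection.replace_one({"title": title}, new_document)
    return result.modified_count
```

Each helper returns modified_count, the number of documents actually changed; the result object also exposes matched_count, which can be higher when a matched document already had the requested values.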

Creating indexes

PyMongo also supports creating indexes, which can improve query performance. Example:

import pymongo

result = db.profiles.create_index([('user_id', pymongo.ASCENDING)], unique=True)
sorted(list(db.profiles.index_information()))
