Welcome to this MongoDB quickstart with Python & MongoEngine.
MongoDB is an open source, non-relational database that uses collections and documents instead of tables, columns and rows, offering a powerful and flexible way to store persistent data for your projects and applications.
MongoEngine is an open source Object-Document Mapper (ODM) for Python, built on top of the pymongo driver, and offers a great deal of control & flexibility for working with MongoDB!
Connecting to a Mongo database is as simple as a single function call:
connect("example-database")
Defining a collection and its document schema is as simple as creating a class:
class BlogPost(Document):
    title = StringField(unique=True, required=True)
    content = StringField()
    date_published = DateTimeField()
    published = BooleanField(default=True)
    comments = ListField(ReferenceField(Comment))
    author = ReferenceField(Author)
    likes = IntField(default=0)
    rating = FloatField(default=0.0)
Where BlogPost is the collection and each new instance of the class is a document.
Saving a document to the database is just a method call away:
blog_post = BlogPost(
    title="Hello World!",
    content="This is just a test",
    date_published=datetime.utcnow(),
    author=author,
).save()
Querying & updating a document is a trivial task:
blog_post = BlogPost.objects(title="Hello World!").get()
blog_post.update(
    content="This is just an update!",
    inc__likes=1
)
Querying an entire collection is a single method call, returning an iterable QuerySet object:
blog_posts = BlogPost.objects()
for post in blog_posts:
    print(post.rating)
To visualize your Mongo data, I highly recommend downloading the official MongoDB data explorer & GUI, Compass.
Getting started
Create a new virtual environment & activate it. We'll call ours env:
python -m venv env
source env/bin/activate
Installing mongoengine
Install mongoengine with a simple pip install command:
pip install mongoengine
If you run pip list, you'll notice pymongo has been installed too.
Open up a new terminal and start the mongod daemon (or however you start MongoDB on your machine!):
mongod
You should see something along the lines of:
2019-03-18T16:31:46.613+0000 I NETWORK [initandlisten] waiting for connections on port 27017
Your database is now up and running and ready to accept connections.
We're going to create a new file called main.py which will be our simple Python file:
touch main.py
Go ahead and open it in your favourite text editor and follow along!
Connecting to MongoDB
To work with MongoDB, we first need to import mongoengine. We're going to import everything for now and do some cleanup after:
from mongoengine import *
To connect to a running instance of MongoDB, we use the connect() function and pass it the name of the database to connect to:
connect("demo-db")
If the database doesn't exist, it will be created the first time data is written to it.
If you have authentication set up on the database (protected with a username and password), you'll need to pass some additional values to the connect() function:
connect(
    db="demo-db",
    username="root",
    password="example",
    authentication_source="admin",
    host="localhost",
    port=27017
)
Tip - Use environment variables or a configuration file (outside of source control) to access the database credentials
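As a minimal sketch of the tip above, the connection settings can be read from environment variables using only the standard library. The MONGO_* variable names and the load_mongo_settings helper are assumptions made for illustration, not part of MongoEngine; the resulting dictionary simply reuses the keyword names from the connect() call shown earlier.

```python
import os


def load_mongo_settings(env=os.environ):
    """Build keyword arguments for connect() from environment variables.

    The MONGO_* names are hypothetical; use whatever your deployment defines.
    """
    settings = {
        "db": env.get("MONGO_DB", "demo-db"),
        "host": env.get("MONGO_HOST", "localhost"),
        "port": int(env.get("MONGO_PORT", "27017")),
    }
    # Only add credentials when a username is actually configured.
    # A configured username without a password raises KeyError on purpose.
    if env.get("MONGO_USER"):
        settings["username"] = env["MONGO_USER"]
        settings["password"] = env["MONGO_PASSWORD"]
        settings["authentication_source"] = env.get("MONGO_AUTH_SOURCE", "admin")
    return settings


# Example using an explicit mapping instead of the real environment:
example = load_mongo_settings({"MONGO_USER": "root", "MONGO_PASSWORD": "example"})
print(example["username"], example["port"])

# With mongoengine imported, the settings could then be passed on:
# connect(**load_mongo_settings())
```

Keeping the lookup in one function means the credentials never appear in main.py itself, and the same file works unchanged across machines.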
Document schema
Defining a document schema with MongoEngine is as easy as creating a new class and using some of the MongoEngine fields to validate the document values.
MongoEngine features over 30 pre-defined fields which we can use for validation, including:
BooleanField - Accepts boolean values
DateTimeField - Accepts datetime values
DictField - Accepts a dictionary
EmailField - Accepts an email string
FloatField - Accepts float values
IntField - Accepts integer values
ListField - Accepts a list
ReferenceField - Accepts a reference to another document
StringField - Accepts a string value
In this example, we'll create a new schema for a user of a web application:
We start out by creating a new class which inherits from Document and create a few class attributes:
class User(Document):
    username = StringField()
    email = EmailField()
    password = BinaryField()
Here, we've used StringField, EmailField and BinaryField as values for their corresponding fields/attributes.
The MongoEngine fields are used to validate the data held by the document, raising an error when the document is validated (by default, when it is saved) if a field holds an incorrect data type (which we'll cover shortly!).
To create a dynamic document schema, you could instead inherit from DynamicDocument. DynamicDocuments do not require an explicit schema!
class User(DynamicDocument):
    pass
For now, we're going to stick with Document.
Field arguments
To add further validation to our document, we can supply arguments to the MongoEngine fields, a few of which include:
default - The value to use when none is provided; a callable is called to produce the value
required - If True, a value must be provided
unique - If True, no documents in the collection will have the same value for this field
choices - An iterable (e.g. list, tuple or set) of choices to which the value of this field should be limited
max_length - An integer defining the maximum length of the value
Before we add more fields to our class, we're going to import datetime:
from datetime import datetime
Our user class is going to need a few more fields, some of which are required, some aren't and some should have a default value:
class User(Document):
    username = StringField(required=True, unique=True)
    email = EmailField(required=True)
    password = BinaryField(required=True)
    first_name = StringField()
    last_name = StringField()
    age = IntField()
    bio = StringField(max_length=100)
    categories = ListField()
    admin = BooleanField(default=False)
    registered = BooleanField(default=False)
    rating = FloatField(default=0.0)
    page_views = IntField(default=0)
    signed_in = BooleanField(default=False)
    last_sign_in = DateTimeField()
    date_created = DateTimeField(default=datetime.utcnow)
A few points to note:
username = StringField(required=True, unique=True) - We've set required and unique to True as the username value must be provided and be unique.
admin = BooleanField(default=False) - We've set default=False in the admin field as it's something that should be explicitly set as True.
bio = StringField(max_length=100) - We've used max_length=100 to set a maximum length of the user bio.
date_created = DateTimeField(default=datetime.utcnow) - default=datetime.utcnow will create a utcnow datetime object on creation of the document.
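The last point is easy to get wrong: default=datetime.utcnow passes the function itself, whereas default=datetime.utcnow() would pass a single timestamp computed once, when the class is defined. The sketch below illustrates the difference with the standard library only; resolve_default is a hypothetical helper that mimics calling a callable default when a value is needed, and is not MongoEngine's own code.

```python
import time
from datetime import datetime


def resolve_default(default):
    """Return the default value, calling it first if it is callable."""
    return default() if callable(default) else default


# Evaluated once, right here, as it would be when a schema is defined.
fixed_default = datetime.utcnow()
# Left as a function, so it is evaluated each time a default is needed.
callable_default = datetime.utcnow

first = resolve_default(callable_default)
time.sleep(0.05)
second = resolve_default(callable_default)

# Each call to the callable default produces a fresh timestamp.
print(second > first)
# The pre-computed value never changes, however often it is resolved.
print(resolve_default(fixed_default) == resolve_default(fixed_default))
```

With the pre-computed form, every document would share the timestamp of the moment the program started, which is rarely what a date_created field should hold.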
You'll also note the use of EmailField, BinaryField, FloatField and BooleanField, which we've used to define the data type we expect for their respective fields.
If any of the values passed into the class do not match their respective data type, an error will be raised when the document is validated on save!
Note - You can also reference other documents in your schema, which we'll be covering in the next MongoEngine article.
Let's go ahead and create an instance of our User class and save it to the database.
Saving documents
We need to provide a binary value to the password field, so for this we'll use the os.urandom function in the os library.
We also need to provide a date to the last_sign_in field; for this we'll need to import datetime.
import os
from datetime import datetime
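The os.urandom value used in this guide is only a stand-in for some binary data. If the field were to hold something derived from a real password, one possible approach is a salted hash from the standard library, sketched below. The hash_password and verify_password helpers, the iteration count and the salt-plus-key layout are all assumptions for illustration; dedicated password hashing libraries are a common choice in practice.

```python
import hashlib
import hmac
import os

SALT_LENGTH = 16  # bytes; an assumed layout: salt first, then the derived key


def hash_password(password, salt=None, iterations=200_000):
    """Return salt + derived key as bytes, suitable for a binary field."""
    if salt is None:
        salt = os.urandom(SALT_LENGTH)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt + key


def verify_password(password, stored, iterations=200_000):
    """Re-derive the key using the stored salt and compare in constant time."""
    salt, key = stored[:SALT_LENGTH], stored[SALT_LENGTH:]
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, key)


stored = hash_password("correct horse battery staple")
print(len(stored))  # 16-byte salt + 32-byte SHA-256 key = 48
print(verify_password("correct horse battery staple", stored))
print(verify_password("wrong password", stored))
```

The bytes returned by hash_password could be passed as the password keyword argument in place of os.urandom(16).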
To create a new instance of a document, we instantiate the class, passing in any required values as keyword arguments:
user = User(
    username="FooBar",
    email="foo@bar.com",
    password=os.urandom(16),
    last_sign_in=datetime.utcnow(),
    admin=True
)
To save a document to the database, we call the .save() method:
user = User(
    username="FooBar",
    email="foo@bar.com",
    password=os.urandom(16),
    last_sign_in=datetime.utcnow(),
    admin=True
).save()
You could alternatively call .save() on the instance after constructing it:
user = User(
    username="FooBar",
    email="foo@bar.com",
    password=os.urandom(16),
    last_sign_in=datetime.utcnow(),
    admin=True
)
user.save()
This is handy if you need to create a new document and update it later on before saving it to the database:
user = User(
    username="FooBar",
    email="foo@bar.com",
    password=os.urandom(16),
    last_sign_in=datetime.utcnow()
)
user.registered = True
user.admin = True
user.save()
If we try and save another document to the database with username="FooBar", we'll get a NotUniqueError as we've set unique=True in the username field of the document schema!
Querying the database
We have a few options when it comes to querying the database. In this example, we're going to cover some of the simple methods (we'll cover advanced querying in the next episode).
To query an entire collection, we call .objects() on the document class:
users = User.objects()
This returns a QuerySet object containing every document in the collection, which we can then iterate over! For example, to iterate over all the users and print out each user's username:
users = User.objects()
for user in users:
    print(user.username)
We can use .field_name to return the value at any given field.
Assuming we have 2 users in our database, JohnDoe and FooBar, we get the following output:
JohnDoe
FooBar
To retrieve a specific, unique document from the database, we use the .objects() method followed by .get() and provide keyword arguments to .objects(). For example, to get FooBar from the database:
current_user = User.objects(username="FooBar").get()
This returns an instance of the User class, on which we can access individual fields with .field_name.
For example, to access the password field of the current_user:
print(current_user.password)
# b'\x7f\x99\xf3\xf3\xc1\x84z\xf7f?\x0f\x9d\xbf\x00\x7f-'
Just remember, the .get() method will only return a unique object! If you had more than one document matching the query, you'd need to use the .objects() method without .get() to return a QuerySet, which you can then iterate over.
If multiple documents are found using the .get() method, MongoEngine raises a MultipleObjectsReturned error.
Tip - Use the .get() method when you know the field you're querying is unique.
If a document matching the query doesn't exist, MongoEngine raises a DoesNotExist error, for example:
current_user = User.objects(username="JaneDoe").get()
# __main__.DoesNotExist: User matching query does not exist
We can handle this with a try/except block:
try:
    current_user = User.objects(username="JaneDoe").get()
    print(current_user.password)
except DoesNotExist:
    print("User not found")
We can provide multiple keyword arguments in the .objects() method to refine our query:
registered_users = User.objects(registered=True, admin=False)
for user in registered_users:
    print(user.username)
Again, this returns a QuerySet object which we can iterate over.
To query the first document in a collection, we can call the .first() method:
user = User.objects().first()
print(user.username) # FooBar
To limit or skip documents of a collection, we can use a Python slice on the query:
users = User.objects()[:3] # only return the first 3 documents in the User collection
print(users)
# [<User: User object>, <User: User object>, <User: User object>]
[:3] - Returns the first 3 documents in the collection (indexes 0 to 2)
[3:] - Returns all documents in the collection except the first 3
[5:10] - Returns the documents at indexes 5 to 9
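Slicing a query in this way corresponds to skipping some documents and limiting how many are returned. The function below is a simplified illustration of that arithmetic for the three slices above; slice_to_skip_limit is a hypothetical helper written for this guide, not MongoEngine's actual code, and it only handles non-negative bounds with a step of 1.

```python
def slice_to_skip_limit(s):
    """Translate a simple slice into a (skip, limit) pair.

    limit is None when the slice has no upper bound.
    """
    if s.step not in (None, 1):
        raise ValueError("only a step of 1 is handled in this sketch")
    skip = s.start or 0
    if skip < 0 or (s.stop is not None and s.stop < 0):
        raise ValueError("negative indices are not handled in this sketch")
    # The stop index is exclusive, so the number of documents is stop - skip.
    limit = None if s.stop is None else max(s.stop - skip, 0)
    return skip, limit


print(slice_to_skip_limit(slice(None, 3)))  # (0, 3)
print(slice_to_skip_limit(slice(3, None)))  # (3, None)
print(slice_to_skip_limit(slice(5, 10)))    # (5, 5)
```

Note that [5:10] skips 5 documents and returns at most 5, because the upper bound of a Python slice is exclusive.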
Updating documents
Updating a document is also simple: call the .update() method and provide keyword values.
Let's say we want to fetch a user from our database and update it with some additional fields.
We'll start by first getting a reference to the user, followed by calling the .update() method and passing in some keyword arguments, all wrapped in a try/except block:
try:
    user = User.objects(username="FooBar").get()
    user.update(
        age=30,
        bio="Explicit is better than implicit"
    )
    print("User updated!")
except DoesNotExist:
    print("User not found")
If you need to work with the document after updating it, you'll need to call the .reload() method on it:
try:
    user = User.objects(username="FooBar").get()
    user.update(
        age=30,
        bio="Explicit is better than implicit"
    )
    user.reload()
    print(user.bio)
except DoesNotExist:
    print("User not found")
Atomic updates
Documents can be updated atomically using the update() and update_one() methods, combined with one of several modifiers.
To increment an integer, we can call the .update() method and pass it inc__field_name=n, where field_name is the name of the field to increment and n is the quantity.
In the following examples, we'll get a reference to the document and print the value before and after updating it.
Let's increment the page_views field by 1:
try:
    user = User.objects(username="FooBar").get()
    print(user.page_views)  # 0
    User.objects(username="FooBar").update_one(inc__page_views=1)
    user.reload()
    print(user.page_views)  # 1
except DoesNotExist:
    print("User not found")
To decrement a value, we do the same with a minor change from inc to dec:
try:
    user = User.objects(username="FooBar").get()
    print(user.page_views)  # 1
    User.objects(username="FooBar").update_one(dec__page_views=1)
    user.reload()
    print(user.page_views)  # 0
except DoesNotExist:
    print("User not found")
To push a value onto a list, we can use the push modifier, followed by the field name and value:
try:
    user = User.objects(username="FooBar").get()
    print(user.categories)  # []
    User.objects(username="FooBar").update_one(push__categories="Python")
    user.reload()
    print(user.categories)  # ["Python"]
except DoesNotExist:
    print("User not found")
To push several values to a list, we can use the push_all modifier, followed by the field name and list of values:
try:
    user = User.objects(username="FooBar").get()
    print(user.categories)  # ["Python"]
    User.objects(username="FooBar").update_one(push_all__categories=["MongoDB", "MongoEngine"])
    user.reload()
    print(user.categories)  # ["Python", "MongoDB", "MongoEngine"]
except DoesNotExist:
    print("User not found")
To pull a value from a list, we can use the pull or pull_all modifier to remove a single item, or multiple items respectively:
try:
    user = User.objects(username="FooBar").get()
    print(user.categories)  # ["Python", "MongoDB", "MongoEngine"]
    User.objects(username="FooBar").update_one(pull__categories="Python")
    user.reload()
    print(user.categories)  # ["MongoDB", "MongoEngine"]
except DoesNotExist:
    print("User not found")
try:
    user = User.objects(username="FooBar").get()
    print(user.categories)  # ["MongoDB", "MongoEngine"]
    User.objects(username="FooBar").update_one(pull_all__categories=["MongoDB", "MongoEngine"])
    user.reload()
    print(user.categories)  # []
except DoesNotExist:
    print("User not found")
To add an item to a list only if it doesn't already exist (AKA a set!), we can use the add_to_set modifier, followed by the field name and a value or a list of values:
try:
    user = User.objects(username="FooBar").get()
    print(user.categories)  # []
    User.objects(username="FooBar").update_one(add_to_set__categories=["Python", "Python", "Python"])
    user.reload()
    print(user.categories)  # ["Python"]
except DoesNotExist:
    print("User not found")
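Each keyword modifier above corresponds to a MongoDB update operator. As a rough, simplified model of that translation (an assumption based on the operator names, not MongoEngine's actual implementation), the hypothetical to_update_document function below builds the kind of update document MongoDB would receive:

```python
def to_update_document(**kwargs):
    """Translate keywords such as inc__page_views=1 into a MongoDB-style update.

    The modifier-to-operator mapping is an assumed, simplified model.
    """
    update = {}
    for key, value in kwargs.items():
        # "push_all__categories" -> modifier "push_all", field "categories"
        modifier, _, field = key.partition("__")
        if modifier == "inc":
            op, arg = "$inc", value
        elif modifier == "dec":
            op, arg = "$inc", -value  # a decrement is a negative increment
        elif modifier == "push":
            op, arg = "$push", value
        elif modifier == "push_all":
            op, arg = "$push", {"$each": list(value)}
        elif modifier == "pull":
            op, arg = "$pull", value
        elif modifier == "pull_all":
            op, arg = "$pullAll", list(value)
        elif modifier == "add_to_set":
            # A list of values is added element by element via $each.
            op = "$addToSet"
            arg = {"$each": list(value)} if isinstance(value, (list, tuple)) else value
        else:
            raise ValueError(f"unsupported modifier in this sketch: {modifier}")
        update.setdefault(op, {})[field] = arg
    return update


print(to_update_document(inc__page_views=1))
# {'$inc': {'page_views': 1}}
print(to_update_document(dec__page_views=1, push__categories="Python"))
# {'$inc': {'page_views': -1}, '$push': {'categories': 'Python'}}
```

Because the whole change is expressed as a single operator document, the database applies it in one step rather than reading the value, modifying it in Python and writing it back.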
Deleting documents
Deleting a document is as simple as calling the .delete() method on a document:
try:
    user = User.objects(username="FooBar").get()
    user.delete()
    print("User deleted")
except DoesNotExist:
    print("User not found")
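You could also call .delete() directly on the query. This doesn't raise DoesNotExist when nothing matches; instead, it returns the number of documents deleted:
deleted = User.objects(username="JohnDoe").delete()
if deleted:
    print("User deleted")
else:
    print("User not found")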
Tip - Deleting documents which are referenced by other documents can lead to data inconsistency. We'll cover this in the next article on MongoEngine.
Wrapping up
This guide was designed as a gentle introduction to working with MongoDB using Python and MongoEngine.
We'll be covering more advanced techniques in further episodes, including referencing other documents from a document, reverse delete rules, advanced queries & more.