MongoDB with Python & MongoEngine | MongoDB & Python Pt. 1

An introduction to Python & MongoDB with the popular open source driver/ORM - MongoEngine


Julian Nash · 8 months ago in Python

Welcome to this MongoDB quickstart with Python & MongoEngine.

MongoDB is an open source, non-relational database that uses collections and documents instead of tables, columns and rows, offering a powerful and flexible way to store persistent data for your projects and applications.

MongoEngine is an open source driver/library/ORM for Python and offers a great deal of control & flexibility for working with MongoDB!

Connecting to a Mongo database is as simple as a single function call:

connect("example-database")

Creating a collection and document schema is as simple as defining a class:

class BlogPost(Document):
    title = StringField(unique=True, required=True)
    content = StringField()
    date_published = DateTimeField()
    published = BooleanField(default=True)
    comments = ListField(ReferenceField(Comment))
    author = ReferenceField(Author)
    likes = IntField(default=0)
    rating = FloatField(default=0.0)

Here, BlogPost is the collection and each new instance of the class is a document.

Saving a document to the database is just a method call away:

blog_post = BlogPost(
    title="Hello World!",
    content="This is just a test",
    date_published=datetime.utcnow(),
    author=author,
).save()

Querying & updating a document is a trivial task:

blog_post = BlogPost.objects(title="Hello World!").get()
blog_post.update(
    content="This is just an update!",
    inc__likes=1
)

Querying an entire collection is a single method call, returning an iterable QuerySet object:

blog_posts = BlogPost.objects()
for post in blog_posts:
    print(post.rating)

To visualize your Mongo data, I highly recommend downloading the official MongoDB data explorer & GUI, Compass.

Getting started

Create a new virtual environment & activate it. We'll call ours env:

python -m venv env
source env/bin/activate

Installing mongoengine

Install mongoengine with a simple pip install command:

pip install mongoengine

If you run pip list, you'll notice that pymongo has also been installed.

Open up a new terminal and start the mongod daemon (or however you start MongoDB on your machine!):

mongod

You should see something along the lines of:

2019-03-18T16:31:46.613+0000 I NETWORK  [initandlisten] waiting for connections on port 27017

Your database is now up and running and ready to accept connections.

We're going to create a new file called main.py which will be our simple Python file:

touch main.py

Go ahead and open it in your favourite text editor and follow along!

Connecting to MongoDB

To work with MongoDB, we first need to import mongoengine. We're going to import everything for now and do some cleanup after:

from mongoengine import *

To connect to a running instance of MongoDB, we use the connect() function and pass it the name of the database to connect to:

connect("demo-db")

If the database doesn't exist, MongoDB will create it the first time you save data to it.

If you have authentication set up on the database (protected with a username and password), you'll need to pass some additional values to the connect() function:

connect(
    db="demo-db",
    username="root",
    password="example",
    authentication_source="admin",
    host="localhost",
    port=27017
)

Tip - Use environment variables or a configuration file (outside of source control) to access the database credentials
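One way to follow this tip is to build the connect() arguments from environment variables. This is only a sketch: the variable names (MONGO_DB, MONGO_USERNAME and so on) and the helper db_settings_from_env are assumptions, not something mongoengine defines:

```python
import os


def db_settings_from_env(environ=os.environ):
    """Build keyword arguments for mongoengine.connect() from environment variables."""
    return {
        "db": environ.get("MONGO_DB", "demo-db"),
        "username": environ.get("MONGO_USERNAME"),
        "password": environ.get("MONGO_PASSWORD"),
        "authentication_source": environ.get("MONGO_AUTH_SOURCE", "admin"),
        "host": environ.get("MONGO_HOST", "localhost"),
        # Environment variables are always strings, so convert the port
        "port": int(environ.get("MONGO_PORT", "27017")),
    }


# A plain dict stands in for os.environ here to keep the example repeatable
settings = db_settings_from_env({"MONGO_USERNAME": "root", "MONGO_PASSWORD": "example"})
print(settings["host"], settings["port"])  # localhost 27017
```

In main.py you would then call connect(**db_settings_from_env()) so that no credentials appear in the source file.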

Document schema

Defining a document schema with Mongoengine is as easy as creating a new class and using some of the Mongoengine fields to validate the document values.

Mongoengine features over 30 pre-defined fields which we can use for validation, including:

  • BooleanField - Accepts boolean values
  • DateTimeField - Accepts datetime values
  • DictField - Accepts a dictionary
  • EmailField - Accepts an email string
  • FloatField - Accepts float values
  • IntField - Accepts integer values
  • ListField - Accepts a list
  • ReferenceField - Accepts a reference to another document
  • StringField - Accepts a string value

In this example, we'll create a new schema for a user of a web application:

We start out by creating a new class which inherits from Document and create a few class attributes:

class User(Document):
    username = StringField()
    email = EmailField()
    password = BinaryField()

Here, we've used StringField, EmailField and BinaryField as values for their corresponding fields/attributes.

The Mongoengine fields are used to validate the data held by a document. Validation runs when the document is saved, throwing an error if a field holds an incorrect data type (which we'll cover shortly!).

To create a dynamic document schema, you could optionally inherit from DynamicDocument instead of Document. DynamicDocuments do not require an explicit schema!

class User(DynamicDocument):
    pass

For now, we're going to stick with Document.

Field arguments

To add further validation to our document, we can supply arguments to the Mongoengine fields, a few of which include:

  • default - The value to use when none is provided (this can also be a callable)
  • required - If True, a value must be provided
  • unique - If True, no documents in the collection will have the same value for this field
  • choices - An iterable (e.g. list, tuple or set) of choices to which the value of this field should be limited
  • max_length - An integer defining the maximum length of the value

Before we add more fields to our class, we're going to import datetime:

from datetime import datetime

Our user class is going to need a few more fields, some of which are required, some aren't and some should have a default value:

class User(Document):
    username = StringField(required=True, unique=True)
    email = EmailField(required=True)
    password = BinaryField(required=True)
    first_name = StringField()
    last_name = StringField()
    age = IntField()
    bio = StringField(max_length=100)
    categories = ListField()
    admin = BooleanField(default=False)
    registered = BooleanField(default=False)
    rating = FloatField(default=0.0)
    page_views = IntField(default=0)
    signed_in = BooleanField(default=False)
    last_sign_in = DateTimeField()
    date_created = DateTimeField(default=datetime.utcnow)

A few points to note:

  • username = StringField(required=True, unique=True) - We've set required and unique to True as the username value must be provided and be unique.
  • admin = BooleanField(default=False) - We've set default=False in the admin field as it's something that should be explicitly set as True.
  • bio = StringField(max_length=100) - We've used max_length=100 to set a maximum length of the user bio.
  • date_created = DateTimeField(default=datetime.utcnow) - default=datetime.utcnow will create a utcnow datetime object on creation of the document.

You'll also note the use of EmailField, BinaryField, FloatField and BooleanField, which we've used to define the data type we expect for their respective fields.

If any of the values do not match their respective data type, a ValidationError will be raised when the document is saved!

Note - You can also reference other documents in your schema, which we'll be covering in the next MongoEngine article.

Let's go ahead and create an instance of our User class and save it to the database.

Saving documents

We need to provide a binary value to the password field, so for this we'll use the os.urandom function from the os module.

We also need to provide a date to the last_sign_in field, for which we'll use datetime:

import os
from datetime import datetime
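In this tutorial os.urandom only stands in for "some bytes". In a real application, the bytes stored in a BinaryField password would normally come from a password hashing function. The following is a minimal sketch using only the standard library; the function names, salt size and iteration count are assumptions, and a dedicated password hashing library may be a better fit for production:

```python
import hashlib
import hmac
import os

SALT_BYTES = 16
ITERATIONS = 200_000  # assumption: tune this for your own hardware


def hash_password(password, salt=None):
    """Return salt + PBKDF2-HMAC-SHA256 digest as bytes."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt + digest


def verify_password(password, stored):
    """Re-hash the candidate with the stored salt and compare in constant time."""
    salt, digest = stored[:SALT_BYTES], stored[SALT_BYTES:]
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)


stored = hash_password("correct horse battery staple")
print(len(stored))  # 48
print(verify_password("correct horse battery staple", stored))  # True
print(verify_password("wrong password", stored))  # False
```

The value returned by hash_password is plain bytes, so it could be passed as password=... when constructing a User.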

To create a new instance of a document, we instantiate the class, passing in any required values as keyword arguments:

user = User(
    username="FooBar",
    email="foo@bar.com",
    password=os.urandom(16),
    last_sign_in=datetime.utcnow(),
    admin=True
)

To save a document to the database, we call the .save() method:

user = User(
    username="FooBar",
    email="foo@bar.com",
    password=os.urandom(16),
    last_sign_in=datetime.utcnow(),
    admin=True
).save()

You could alternatively call .save() on the instance after constructing it:

user = User(
    username="FooBar",
    email="foo@bar.com",
    password=os.urandom(16),
    last_sign_in=datetime.utcnow(),
    admin=True
)

user.save()

This is handy if you need to create a new document and update it later on before saving it to the database:

user = User(
    username="FooBar",
    email="foo@bar.com",
    password=os.urandom(16),
    last_sign_in=datetime.utcnow()
)

user.registered = True
user.admin = True

user.save()

If we try and save another document to the database with username="FooBar", we'll get a NotUniqueError as we've set unique=True in the username field of the document schema!

Querying the database

We have a few options when it comes to querying the database. In this example, we're going to cover some of the simple methods (we'll cover advanced querying in the next episode).

To query an entire collection, we call the .objects() method on the document class:

users = User.objects()

This returns a QuerySet object containing every document in the collection, which we can then iterate over! For example, to iterate over all the users and print out each user's username:

users = User.objects()

for user in users:
    print(user.username)

We can use .field_name to return the value at any given field.

Assuming we have 2 users in our database, JohnDoe and FooBar, we get the following output:

JohnDoe
FooBar

To retrieve a specific, unique document from the database, we use the .objects() method followed by .get() and provide keyword arguments to .objects(), for example, to get FooBar from the database:

current_user = User.objects(username="FooBar").get()

This returns an instance of the User class, whose individual fields we can access with .field_name.

For example, to access the password field of the current_user:

print(current_user.password)
# b'\x7f\x99\xf3\xf3\xc1\x84z\xf7f?\x0f\x9d\xbf\x00\x7f-'

Just remember, the .get() method will only return a unique object! If you had more than one document matching the query, you'd need to use the .objects() method without .get() to return a QuerySet, which you can then iterate over.

If multiple documents are found using the .get() method, Mongoengine raises a MultipleObjectsReturned error.

Tip - Use the .get() method when you know the field you're querying is unique.

If a document matching the query doesn't exist, Mongoengine raises a DoesNotExist error, for example:

current_user = User.objects(username="JaneDoe").get()
# __main__.DoesNotExist: User matching query does not exist

We can handle this with a try/except block:

try:
    current_user = User.objects(username="JaneDoe").get()
    print(current_user.password)
except DoesNotExist:
    print("User not found")

We can provide multiple keyword arguments in the .objects() method to refine our query:

registered_users = User.objects(registered=True, admin=False)
for user in registered_users:
    print(user.username)

Again, this returns a QuerySet object which we can iterate over.

To query the first document in a collection, we can call the .first() method:

user = User.objects().first()
print(user.username)  # FooBar

To limit or skip documents in a collection, we can use a Python slice on the query:

users = User.objects()[:3]  # only return the first 3 documents in the User collection
print(users)
# [<User: User object>, <User: User object>, <User: User object>]

  • [:3] - Returns the first 3 documents in the collection (indexes 0 to 2)
  • [3:] - Returns all documents in the collection except the first 3
  • [5:10] - Returns a slice of 5 documents (indexes 5 to 9)
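Slices like these are a natural fit for pagination. The sketch below computes the slice indexes for a given page; page_bounds is a hypothetical helper, and a plain list stands in for the QuerySet so the example runs without a database:

```python
def page_bounds(page, per_page=10):
    """Return (start, end) slice indexes for a 1-based page number."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive integers")
    start = (page - 1) * per_page
    return start, start + per_page


# Demonstrated on a plain list; the same indexes can be used on a QuerySet,
# e.g. User.objects()[start:end]
usernames = [f"user{n}" for n in range(1, 26)]
start, end = page_bounds(page=3, per_page=10)
print(start, end)  # 20 30
print(usernames[start:end])  # ['user21', 'user22', 'user23', 'user24', 'user25']
```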

Updating documents

Updating a document is just as simple: call the .update() method and provide keyword arguments.

Let's say we want to fetch a user from our database and update it with some additional fields.

We'll start by first getting a reference to the user, followed by calling the .update() method and passing in some keyword arguments, all wrapped in a try/except block:

try:
    user = User.objects(username="FooBar").get()
    user.update(
        age=30,
        bio="Explicit is better than implicit"
    )
    print("User updated!")
except DoesNotExist:
    print("User not found")

If you need to work with the document after updating it, you'll need to call the .reload() method on it:

try:
    user = User.objects(username="FooBar").get()
    user.update(
        age=30,
        bio="Explicit is better than implicit"
    )
    user.reload()
    print(user.bio)
except DoesNotExist:
    print("User not found")

Atomic updates

Documents can be updated atomically using the update() and update_one() methods, combined with one or more of several modifiers.

To increment an integer, we can call the .update() method and pass it inc__field_name=n, where field_name is the name of the field to increment and n is the quantity.

In the following examples, we'll get a reference to the document and print the value before and after updating it.

Let's increment the page_views field by 1:

try:
    user = User.objects(username="FooBar").get()
    print(user.page_views)  # 0
    User.objects(username="FooBar").update_one(inc__page_views=1)
    user.reload()
    print(user.page_views)  # 1
except DoesNotExist:
    print("User not found")

To decrement a value, we do the same with a minor change from inc to dec:

try:
    user = User.objects(username="FooBar").get()
    print(user.page_views)  # 1
    User.objects(username="FooBar").update_one(dec__page_views=1)
    user.reload()
    print(user.page_views)  # 0
except DoesNotExist:
    print("User not found")

To push a value onto a list, we can use the push modifier, followed by the field name and value:

try:
    user = User.objects(username="FooBar").get()
    print(user.categories)  # []
    User.objects(username="FooBar").update_one(push__categories="Python")
    user.reload()
    print(user.categories)  # ["Python"]
except DoesNotExist:
    print("User not found")

To push several values to a list, we can use the push_all modifier, followed by the field name and list of values:

try:
    user = User.objects(username="FooBar").get()
    print(user.categories)  # ["Python"]
    User.objects(username="FooBar").update_one(push_all__categories=["MongoDB", "MongoEngine"])
    user.reload()
    print(user.categories)  # ["Python", "MongoDB", "MongoEngine"]
except DoesNotExist:
    print("User not found")

To pull a value from a list, we can use the pull or pull_all modifier to remove a single item, or multiple items respectively:

try:
    user = User.objects(username="FooBar").get()
    print(user.categories)  # ["Python", "MongoDB", "MongoEngine"]
    User.objects(username="FooBar").update_one(pull__categories="Python")
    user.reload()
    print(user.categories)  # ["MongoDB", "MongoEngine"]
except DoesNotExist:
    print("User not found")

try:
    user = User.objects(username="FooBar").get()
    print(user.categories)  # ["MongoDB", "MongoEngine"]
    User.objects(username="FooBar").update_one(pull_all__categories=["MongoDB", "MongoEngine"])
    user.reload()
    print(user.categories)  # []
except DoesNotExist:
    print("User not found")

To add an item to a list only if it doesn't already exist (AKA a set!), we can use the add_to_set modifier, followed by the field name and a value or a list of values:

try:
    user = User.objects(username="FooBar").get()
    print(user.categories)  # []
    User.objects(username="FooBar").update_one(add_to_set__categories=["Python", "Python", "Python"])
    user.reload()
    print(user.categories)  # ["Python"]
except DoesNotExist:
    print("User not found")

Deleting documents

Deleting a document is as simple as calling the .delete() method on a document:

try:
    user = User.objects(username="FooBar").get()
    user.delete()
    print("User deleted")
except DoesNotExist:
    print("User not found")

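You could also call .delete() directly on the query. This returns the number of documents deleted rather than raising DoesNotExist:

# QuerySet.delete() returns the number of documents it removed
deleted = User.objects(username="JohnDoe").delete()
if deleted:
    print("User deleted")
else:
    print("User not found")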

Tip - Deleting documents which are referenced by other documents can lead to data inconsistency. We'll cover this in the next article on MongoEngine.

Wrapping up

This guide was designed as a gentle introduction to working with MongoDB using Python and MongoEngine.

We'll be covering more advanced techniques in further episodes, including referencing other documents from a document, reverse delete rules, advanced queries & more.

Last modified · 18 Mar 2019
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License