Sitecore MongoDB Blog series: Part 2 - Understanding scaling, contacts and queries

In the previous blog post we went through an introduction to MongoDB with Sitecore, its features and installation. In this post we will go over the scaling options available in MongoDB, followed by an introduction to contacts and out-of-the-box queries.

Scaling:

There are three types of scaling:

  1. Standalone environment
  2. Vertical Scaling and
  3. Horizontal Scaling

Standalone environment:

A standalone setup is an all-in-one configuration, where we install all xDB components on the same computer. This includes:

  • Content management server
  • Content delivery server
  • Database server
  • Reporting server
  • Collection server.

This is not an optimal setup for production. It mostly resembles a development environment, where all components sit on the same workstation, so we can call this setup a “not scalable environment”.

[Image: standalone setup]

Vertical Scaling:

Vertical scaling means adding more resources to a single node in the system, which typically involves adding or upgrading hardware on a single machine.

When we start moving towards a vertical setup, we tend to have separate servers for each component, i.e. separate servers for:

  • Database
  • Content management
  • Content delivery and
  • Reporting server

If we see that a specific component requires a hardware upgrade, we can scale up just that server without touching any other, and in this way we can scale the complete Sitecore system.

Horizontal Scaling:

We can scale each component of the system by following vertical scaling, but what if we have just one content delivery server and, because of some server issue, we lose all data from that server? Hard to imagine, right?

In this case, even if we have scaled up the content delivery server by upgrading its size, RAM and other components as required, that can’t help us if something goes wrong with that specific server, which will ultimately result in data loss.

In this scenario, we can resolve the issue by deploying multiple servers for the same components, which includes:

  • Multiple content management servers
  • Multiple content delivery servers
  • Multiple MongoDB (Analytics) servers
  • Separate session state server.

This type of setup resolves the issue of one server going down for some reason. From the MongoDB perspective, we achieve this by adding multiple servers for Analytics, which we do by setting up a replica set.

By means of replication we achieve the following:

  • Availability
  • MongoDB provides high data availability with replica sets.
  • A replica set consists of two or more copies of the same data.

In a replica set, we set up an environment that defines a primary server, which is used to read and write the Analytics information. At the same time, all data written to the primary (replicaset-1) is copied to the secondaries (replicaset-2 and replicaset-3), so all the servers are kept in sync.

From here, if something goes wrong with the replicaset-1 server, MongoDB internally elects either replicaset-2 or replicaset-3 as the new primary for reading and writing the information. This way we can maintain data availability.

[Image: horizontal scalability]
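To make the replica set idea a little more concrete, here is a minimal sketch of the configuration document that the mongo shell helper rs.initiate() accepts, together with a matching connection string. The set name "rs0", the host names and the database name "analytics" are placeholders assumed for illustration; they are not taken from a real Sitecore installation.

```javascript
// Hypothetical host names for the three analytics servers described above.
const hosts = [
  "replicaset-1.example.local:27017",
  "replicaset-2.example.local:27017",
  "replicaset-3.example.local:27017",
];

// Build the document that would be passed to rs.initiate() in the mongo shell.
// Each member needs a unique numeric _id and a "host:port" string.
function buildReplicaSetConfig(setName, memberHosts) {
  return {
    _id: setName,
    members: memberHosts.map((host, index) => ({ _id: index, host })),
  };
}

// Build a connection string that lists every member, so that a client can
// still find the current primary after a failover.
function buildConnectionString(setName, memberHosts, database) {
  return `mongodb://${memberHosts.join(",")}/${database}?replicaSet=${setName}`;
}

const config = buildReplicaSetConfig("rs0", hosts);
const connectionString = buildConnectionString("rs0", hosts, "analytics");

console.log(JSON.stringify(config, null, 2));
console.log(connectionString);

// In the mongo shell, connected to one of the servers, the set would then be
// started with: rs.initiate(config)
```

Listing all members in the connection string, rather than only the primary, is what lets the application keep working when a different server takes over as primary.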

Introduction to Contacts:

  • In xDB a contact is an individual visitor.
  • This visitor may be anonymous or may have been authenticated.
  • A contact is a combination of facets.
  • A contact includes:
    • Identifiers
    • Personal Information
    • Email
    • Phone Number
    • Addresses

[Image: contacts]

Identifying Contacts:

  • Contact identification is the process of connecting the current session, device and contact session to an identifier. This is implemented using the Identify() method which is part of the Sitecore Analytics tracker namespace.
  • Sitecore.Analytics.Tracker.Current.Session.Identify(identifier)
  • A contact is always identified by an identifier. An identifier is a string value that uniquely identifies a contact in relation to the website, and this value is always provided by the contact.
  • Identifiers can be one of the following:
    • User login
    • User id from third party system and/or
    • Email address
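The identification flow can be pictured with a small in-memory model. This is plain JavaScript and not the Sitecore API: the identifiers and contacts maps, the startSession and identify functions and the sample email address are all hypothetical stand-ins. They only illustrate how a known identifier leads back to the same contact across sessions.

```javascript
// Toy stand-ins for the Identifiers and Contacts collections (hypothetical shapes).
const identifiers = new Map(); // identifier string -> contact id
const contacts = new Map();    // contact id -> contact document
let nextContactId = 1;

// Every new session starts with a fresh anonymous contact.
function startSession() {
  const contactId = `contact-${nextContactId++}`;
  contacts.set(contactId, { id: contactId, identifier: null });
  return { contactId };
}

// Connect a session to the contact that owns the given identifier.
function identify(session, identifier) {
  let contactId = identifiers.get(identifier);
  if (contactId === undefined) {
    // First time this identifier is seen: keep the session's anonymous
    // contact and attach the identifier to it.
    contactId = session.contactId;
    identifiers.set(identifier, contactId);
    contacts.get(contactId).identifier = identifier;
  }
  // Known identifier: the session is pointed at the existing contact.
  // (This toy simply leaves the unused anonymous contact behind.)
  session.contactId = contactId;
  return contacts.get(contactId);
}

const firstVisit = startSession();
identify(firstVisit, "ankit@example.com");

const secondVisit = startSession(); // e.g. the same person on another device
const contact = identify(secondVisit, "ankit@example.com");

console.log(contact.id === firstVisit.contactId); // true
```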

Here is a sample snippet which shows how we can validate the user in MongoDB:

[Image: validating the user in MongoDB]

MongoDB Queries:

Let’s look at two sample queries, which are used to fetch data out of the collections.

Consider a case where we have millions of records in the “Contacts” collection and want to get a specific contact record. We can add a filter on the first name, and we use the “Personal.FirstName” facet field for this.

db.getCollection('Contacts').find({"Personal.FirstName":"Ankit Joshi"})

[Image: FirstName filter]
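The filter above uses dot notation to reach a field nested inside the Personal facet. As a rough illustration of what that match means, the sketch below applies the same filter to a few made-up documents in plain JavaScript. The getPath and find helpers and the sample data are hypothetical, and they only mimic exact-match filters; this is not how MongoDB executes queries internally.

```javascript
// Made-up documents standing in for the Contacts collection.
const sampleContacts = [
  { _id: 1, Personal: { FirstName: "Ankit Joshi" } },
  { _id: 2, Personal: { FirstName: "Jane" } },
  { _id: 3, System: { VisitCount: 4 } }, // no Personal facet at all
];

// Resolve a dotted path such as "Personal.FirstName" against a document.
// Returns undefined as soon as any step of the path is missing.
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Return the documents whose fields equal every value in the filter.
function find(docs, filter) {
  return docs.filter((doc) =>
    Object.entries(filter).every(
      ([path, expected]) => getPath(doc, path) === expected
    )
  );
}

const matches = find(sampleContacts, { "Personal.FirstName": "Ankit Joshi" });
console.log(matches.map((doc) => doc._id)); // [ 1 ]
```

Documents without the Personal facet are simply skipped, which mirrors how an equality filter on a missing field does not match.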

As another example, if we want to find an identifier based on a specific Id, we can use this query:

db.getCollection('Identifiers').find({"_id":"ANKIT"})

[Image: identifier filter]

In the same way, we can also create custom collections and add documents to them using the Mongo Shell. The beauty of this is that when we add a document to a collection that doesn’t exist yet, MongoDB creates the collection automatically. Documents in the same collection can also have different structures, which makes it more flexible.
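The automatic creation and the flexible document structure described above can be pictured with a toy model in plain JavaScript. The ToyDatabase class is a hypothetical in-memory stand-in, not the Mongo Shell or a MongoDB driver, and the collection name "CustomCollection" is made up. It only mirrors the behaviour of inserting into a collection that does not exist yet.

```javascript
// Hypothetical in-memory stand-in for a database; not a real MongoDB API.
class ToyDatabase {
  constructor() {
    this.collections = new Map(); // collection name -> array of documents
  }

  // Asking for a collection never fails, even if it does not exist yet.
  getCollection(name) {
    return {
      insert: (doc) => {
        // The collection is created on the first insert.
        if (!this.collections.has(name)) {
          this.collections.set(name, []);
        }
        this.collections.get(name).push(doc);
      },
      find: () => this.collections.get(name) || [],
    };
  }

  getCollectionNames() {
    return [...this.collections.keys()];
  }
}

const db = new ToyDatabase();
console.log(db.getCollectionNames()); // []

const custom = db.getCollection("CustomCollection");
// Two documents with different structures in the same collection.
custom.insert({ name: "first", tags: ["a", "b"] });
custom.insert({ title: "second", count: 2 });

console.log(db.getCollectionNames()); // [ 'CustomCollection' ]
console.log(custom.find().length);    // 2
```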

Let me know your feedback and comments, if any.

References:

https://doc.sitecore.net/sitecore_experience_platform/setting_up_and_maintaining/xdb/platform/scalability_options

Happy learning 🙂
