MongoDB

Question : List the main features of MongoDB.
Answer :
Indexing
MongoDB supports generic secondary indexes and provides unique, compound, geospatial, and full-text indexing capabilities as well. Secondary indexes on hierarchical structures such as nested documents and arrays are also supported and enable developers to take full advantage of the ability to model in ways that best suit their applications.

Aggregation
MongoDB provides an aggregation framework based on the concept of data processing pipelines. Aggregation pipelines allow you to build complex analytics engines by processing data through a series of relatively simple stages on the server side and with the full advantage of database optimizations.

Special collection and index types
MongoDB supports time-to-live (TTL) collections for data that should expire at a certain time, such as sessions, and fixed-size (capped) collections for holding recent data, such as logs. MongoDB also supports partial indexes limited to only those documents matching a filter criterion, which increases efficiency and reduces the amount of storage space required.

File storage
MongoDB supports an easy-to-use protocol for storing large files and file metadata.

Question : What is a document?
Answer : At the heart of MongoDB is the document: an ordered set of keys with associated values. The representation of a document varies by programming language, but most languages have a data structure that is a natural fit, such as a map, hash, or dictionary. In JavaScript, for example, documents are represented as objects:
{"greeting" : "Hello, world!"}
This simple document contains a single key, "greeting", with a value of "Hello, world!". Most documents will be more complex than this simple one and often will contain multiple key/value pairs:
{"greeting" : "Hello, world!", "views" : 3}

Question : Describe keys in MongoDB.
Answer : The keys in a document are strings. Any UTF-8 character is allowed in a key, with a few notable exceptions:
Keys must not contain the character \0 (the null character). This character is used to signify the end of a key.
The . and $ characters have some special properties and should be used only in certain circumstances. In general, they should be considered reserved, and drivers will complain if they are used inappropriately.

Question : What is collection?
Answer : A collection is a group of documents. If a document is the MongoDB analog of a row in a relational database, then a collection can be thought of as the analog to a table.

Question : Are collection schemas flexible or hard to modify?
Answer : Collections have dynamic schemas. This means that the documents within a single collection can have any number of different “shapes.” For example, both of the following documents could be stored in a single collection:

{"greeting" : "Hello, world!", "views": 3}
{"signoff": "Good night, and good luck"}

Question : “Why do we need separate collections at all?”
Answer : Keeping different kinds of documents in the same collection can be a nightmare for developers and admins. Developers need to make sure that each query is only returning documents of a certain type or that the application code performing a query can handle documents of different shapes. If we’re querying for blog posts, it’s a hassle to weed out documents containing author data.
It is much faster to get a list of collections than to extract a list of the types in a collection. For example, if we had a "type" field in each document that specified whether the document was a “skim,” “whole,” or “chunky monkey,” it would be much slower to find those three values in a single collection than to have three separate collections and query the correct collection.
Grouping documents of the same kind together in the same collection allows for data locality. Getting several blog posts from a collection containing only posts will likely require fewer disk seeks than getting the same posts from a collection containing posts and author data.
We begin to impose some structure on our documents when we create indexes. (This is especially true in the case of unique indexes.) These indexes are defined per collection. By putting only documents of a single type into the same collection, we can index our collections more efficiently.

Question : What is a database?
Answer : In addition to grouping documents by collection, MongoDB groups collections into databases. A single instance of MongoDB can host several databases, each grouping together zero or more collections. A database has its own permissions, and each database is stored in separate files on disk. A good rule of thumb is to store all data for a single application in the same database. Separate databases are useful when storing data for several applications or users on the same MongoDB server.

Question : What are the constraints on database names?
Answer : Database names must follow these constraints:
The empty string ("") is not a valid database name.
A database name cannot contain any of these characters: /, \, ., ", *, <, >, :, |, ?, $, (a single space), or \0 (the null character). Basically, stick with alphanumeric ASCII.
Database names are case-sensitive, even on non-case-sensitive filesystems. To keep things simple, try to just use lowercase characters.
Database names are limited to a maximum of 64 bytes.
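
The constraints above can be expressed as a small check. The following is a minimal sketch in plain JavaScript (run under Node.js, which provides Buffer). isValidDatabaseName is a hypothetical helper written for illustration, not part of MongoDB or its drivers, and the server's own validation may differ in detail.

```javascript
// Characters the list above says a database name cannot contain.
const FORBIDDEN = ['/', '\\', '.', '"', '*', '<', '>', ':', '|', '?', '$', ' ', '\0'];

// Hypothetical helper: returns true if the name passes the listed constraints.
function isValidDatabaseName(name) {
  if (typeof name !== 'string' || name === '') return false;    // empty string is invalid
  if (FORBIDDEN.some((ch) => name.includes(ch))) return false;  // reserved characters
  return Buffer.byteLength(name, 'utf8') <= 64;                 // 64-byte limit stated above
}

console.log(isValidDatabaseName('blog'));   // true
console.log(isValidDatabaseName('my.db'));  // false
console.log(isValidDatabaseName(''));       // false
```

Because names are case-sensitive, 'Blog' and 'blog' would both pass this check yet refer to different databases, which is one reason to stick to lowercase.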

Question : List the reserved databases and their usage.
Answer : There are several reserved database names, which you can access but which have special semantics. These are as follows:

admin : The admin database plays a role in authentication and authorization. In addition, access to this database is required for some administrative operations. See [Link to Come] for more information about the admin database.

local : This database stores data specific to a single server. In replica sets, local stores data used in the replication process. The local database itself is never replicated. See Chapter 9 for more information about replication and the local database.

config : Sharded MongoDB clusters use the config database to store information about each shard.

Question : Give a brief overview of the MongoDB shell.
Answer : MongoDB comes with a JavaScript shell that allows interaction with a MongoDB instance from the command line. The shell is useful for performing administrative functions, inspecting a running instance, or just exploring MongoDB. The mongo shell is a crucial tool for using MongoDB. We use the mongo shell extensively throughout the rest of the text.

Running the Shell
To start the shell, run the mongo executable:

$ mongo
MongoDB shell version: 3.3.5
connecting to: test
>
The shell automatically attempts to connect to a MongoDB server on startup, so make sure you start mongod before starting the shell.

Question : How do you create a collection from the mongo shell?
Answer : We can use db.createCollection("temp");

Question : How can you insert a record into a collection?
Answer : db.temp.insertOne({id : 1});

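Question : How do you find all the documents that match a query?
Answer : db.temp.find()
Called with no arguments, find() returns every document in the collection. Pass a query document to filter the results, e.g. db.temp.find({id : 1}).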

Question : How do you find a single document that matches a query?
Answer : db.temp.findOne();

Question : How do you update a record that matches a given query?
Answer : db.temp.update({},{ $set : { id : 2} })

Question : How do you update multiple documents?
Answer : Pass multi : true as an option so that the update is applied to every document that matches the query:
db.temp.update({},{ $set : { id : 2} }, {multi : true})

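Question : How can you delete a document from a collection?
Answer : db.temp.deleteOne({id:2});
Related methods include deleteMany, which deletes every matching document, and findOneAndDelete, which deletes a document and returns it. findOneAndReplace replaces a matching document rather than deleting it.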

Question : What is _id and ObjectIds?
Answer : Every document stored in MongoDB must have an "_id" key. The "_id" key’s value can be any type, but it defaults to an ObjectId. In a single collection, every document must have a unique value for "_id", which ensures that every document in a collection can be uniquely identified. That is, if you had two collections, each one could have a document where the value for "_id" was 123. However, neither collection could contain more than one document with an "_id" of 123.

Question : How does a MongoDB ObjectId differ from a traditional key?
Answer : ObjectId is the default type for "_id". The ObjectId class is designed to be lightweight, while still being easy to generate in a globally unique way across different machines. MongoDB’s distributed nature is the main reason why it uses ObjectIds as opposed to something more traditional, like an autoincrementing primary key: it is difficult and time-consuming to synchronize autoincrementing primary keys across multiple servers. Because MongoDB was designed to be a distributed database, it was important to be able to generate unique identifiers in a sharded environment.

Question : How are ObjectIds generated?
Answer : The 12 bytes of an ObjectId are generated as follows:

Bytes 0-3    Bytes 4-6    Bytes 7-8    Bytes 9-11
Timestamp    Machine      PID          Increment

The first four bytes of an ObjectId are a timestamp in seconds since the epoch. This provides a couple of useful properties:
The timestamp, when combined with the next five bytes (which will be described in a moment), provides uniqueness at the granularity of a second.
Because the timestamp comes first, it means that ObjectIds will sort in roughly insertion order. This is not a strong guarantee but does have some nice properties, such as making ObjectIds efficient to index.
In these four bytes exists an implicit timestamp of when each document was created. Most drivers expose a method for extracting this information from an ObjectId.
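
To make the embedded timestamp concrete, here is a minimal sketch in plain JavaScript that reads the first four bytes (eight hex characters) of an ObjectId's hex string. objectIdTimestamp is a hypothetical helper written for illustration; it does not use a MongoDB driver, and the sample value is one of the ObjectIds shown in the insertMany output later in this post.

```javascript
// Hypothetical helper: returns the creation time encoded in an ObjectId hex string.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error('expected 24 hex characters (12 bytes)');
  }
  // The first 4 bytes are seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const created = objectIdTimestamp('59a477fb97347a91d48405f4');
console.log(created.toISOString());
```

In the mongo shell the same information is available without any parsing through the ObjectId's getTimestamp() method.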

Question : How do you get the version of the database server?
Answer : db.version();

Question : Describe insertMany().
Answer : If you need to insert multiple documents into a collection, you can use insertMany. This method enables you to pass an array of documents to the database. This is far more efficient than inserting documents one at a time because your code will not make a round trip to the database for each document inserted, but will insert them in bulk.
db.temp.insertMany([
{"id": 5},
{"id": 6},
{"id": 7}
]);
Output of the above command:

{
"acknowledged" : true,
"insertedIds" : [
ObjectId("59a477fb97347a91d48405f4"),
ObjectId("59a477fb97347a91d48405f5"),
ObjectId("59a477fb97347a91d48405f6")
]
}

Sending dozens, hundreds, or even thousands of documents at a time can make inserts significantly faster. insertMany is useful if you are inserting multiple documents into a single collection.

Question : What is insert(), and should we still use it?
Answer : insert()
In versions of MongoDB prior to 3.0, insert() was the primary method for inserting documents into MongoDB. MongoDB drivers introduced a new CRUD API at the same time as the MongoDB 3.0 server release. As of MongoDB 3.2 the mongo shell also supports this API, which includes insertOne and insertMany as well as several other methods. The goal of the current CRUD API is to make the semantics of all CRUD operations consistent and clear across the drivers and the shell. While methods such as insert() are still supported for backward compatibility, they should not be used in applications going forward. You should instead prefer insertOne and insertMany for creating documents.

Question : Does the drop command delete the indexes as well?
Answer : Yes. The drop command removes the indexes along with the data. To clear the data but keep the indexes, use the deleteMany command instead.

Question : How can we remove a key/value pair (more precisely, a key) from one or more documents?
Answer : Use the $unset operator in an update operation. It is the opposite of the $set operator.
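
As a rough illustration of what $unset does to a document, here is a simplified in-memory model in plain JavaScript. applyUnset is a hypothetical function written for this sketch, it only handles top-level keys, and it is not how the server implements the operator.

```javascript
// Simplified model of $unset: returns a copy of the document
// without the top-level keys named in the $unset specification.
function applyUnset(doc, unsetSpec) {
  const result = { ...doc };
  for (const key of Object.keys(unsetSpec)) {
    delete result[key]; // the value given for the key in the spec is ignored
  }
  return result;
}

const before = { _id: 1, id: 2, name: 'ARCHIT' };
const after = applyUnset(before, { id: '' });
console.log(after); // { _id: 1, name: 'ARCHIT' }

// Shell equivalent against a running server:
//   db.temp.updateMany({}, { $unset: { id: "" } })
```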

Question : How can we increment the value for a given key?
Answer : The $inc operator can be used to change the value for an existing key or to create a new key if it does not already exist. It is very useful for updating analytics, karma, votes, or anything else that has a changeable, numeric value.
db.temp.updateMany({},{$inc : { id : 5}});

Question : How can we decrement the value?
Answer : We need to pass the negative value to the $inc operator.
db.temp.updateMany({}, {$inc : {id : -1}})

Question : What is $push operator?
Answer : "$push" adds elements to the end of an array if the array exists and creates a new array if it does not. For example, suppose that we are storing blog posts and want to add a "comments" key containing an array. We can push a comment onto the nonexistent "comments" array, which will create the array and add the comment.
db.temp.update({}, {$push : { user : { firstName : "vikash", lastName : "Mishra" }}});

Question : How can we add multiple elements with a single $push?
Answer : Use the $each modifier together with $push. For example, to add all the gadgets for an individual:
db.temp.update({name:"ARCHIT"}, {$push : { gadgets : { $each : ["Ipad","Samsung VR","Iphone","Fitbit","Bluetooth Headphone"] }}});

Question : What is $addToSet?
Answer : $addToSet adds a value to an array only if it is not already present, so it is useful when the array must not contain duplicate entries.
db.temp.update({}, {$addToSet : { gadgets : { $each : ["Ipad","Samsung VR","Iphone","Fitbit","Bluetooth Headphone","Ipod"] }}});
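
The duplicate-avoiding behavior of $addToSet with $each can be sketched in plain JavaScript. applyAddToSetEach is a hypothetical function for illustration; it compares values with strict equality, which is enough for strings and numbers but is only an approximation of how the server compares embedded documents.

```javascript
// Simplified model of {$addToSet: {field: {$each: values}}} for primitive values.
function applyAddToSetEach(array, values) {
  const result = array.slice();
  for (const value of values) {
    if (!result.includes(value)) {
      result.push(value); // append only values that are not already present
    }
  }
  return result;
}

const gadgets = ['Ipad', 'Fitbit'];
const updated = applyAddToSetEach(gadgets, ['Ipad', 'Ipod', 'Fitbit', 'Iphone']);
console.log(updated); // [ 'Ipad', 'Fitbit', 'Ipod', 'Iphone' ]
```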

Question : What is $pop?
Answer : $pop removes an element from either end of an array.
db.temp.update({}, {$pop : {gadgets : 1}});
db.temp.update({}, {$pop : {gadgets : -1}});
{"$pop" : {"key" : 1}} removes an element from the end of the array. {"$pop" : {"key" : -1}} removes it from the beginning.
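
The two directions of $pop can be modeled with ordinary array operations. This is a minimal sketch in plain JavaScript; applyPop is a hypothetical function for illustration, not the server's implementation.

```javascript
// Simplified model of $pop: 1 removes the last element, -1 removes the first.
function applyPop(array, direction) {
  const copy = array.slice();
  if (direction === 1) {
    copy.pop();
  } else if (direction === -1) {
    copy.shift();
  } else {
    throw new Error('$pop expects 1 or -1');
  }
  return copy;
}

const items = ['Ipad', 'Samsung VR', 'Iphone'];
console.log(applyPop(items, 1));  // [ 'Ipad', 'Samsung VR' ]
console.log(applyPop(items, -1)); // [ 'Samsung VR', 'Iphone' ]
```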

Question : How do you remove a specific element from an array?
Answer : Use the $pull operator:
db.temp.update({}, {$pull : {gadgets : "Fitbit"}});

Question : How does the positional operator modify an element in an array?
Answer : In many cases, we don’t know what index of the array to modify without querying for the document first and examining it. To get around this, MongoDB has a positional operator, $, that figures out which element of the array the query document matched and updates that element. For example, if a user whose last name is Mishra changes it to Chandra, we can update the matching element of the "user" array by using the positional operator:
db.temp.updateMany({"user.lastName":"Mishra"},{$set : {"user.$.lastName" : "Chandra"}});
The positional operator updates only the first match. Thus, if the array held more than one element with the last name Mishra, only the first one would be changed.
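
The first-match behavior of the positional operator can be modeled in plain JavaScript. setFirstMatch is a hypothetical function written for this sketch, and the sample names are made up; it only shows that a single array element is changed even when several match.

```javascript
// Simplified model of {$set: {"arrayField.$.key": newValue}}:
// updates only the first array element whose key equals matchValue.
function setFirstMatch(doc, arrayField, key, matchValue, newValue) {
  const index = doc[arrayField].findIndex((el) => el[key] === matchValue);
  if (index === -1) return doc; // nothing matched, nothing to change
  const elements = doc[arrayField].map((el, i) =>
    i === index ? { ...el, [key]: newValue } : el
  );
  return { ...doc, [arrayField]: elements };
}

const original = {
  user: [
    { firstName: 'vikash', lastName: 'Mishra' },
    { firstName: 'amit', lastName: 'Mishra' },
  ],
};
const changed = setFirstMatch(original, 'user', 'lastName', 'Mishra', 'Chandra');
console.log(changed.user.map((u) => u.lastName)); // [ 'Chandra', 'Mishra' ]
```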

Question : What are the conditional operators?
Answer : "$lt", "$lte", "$gt", and "$gte" are all comparison operators, corresponding to <, <=, >, and >=, respectively. They can be combined to look for a range of values. For example, to look for users who are between the ages of 18 and 30, we can do this:
db.users.find({"age" : {"$gte" : 18, "$lte" : 30}})
This would find all documents where the "age" field was greater than or equal to 18 AND less than or equal to 30.
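
To show how the combined conditions act as a logical AND, here is a small in-memory model in plain JavaScript. matchesRange and the sample users are hypothetical, and the sketch covers only the four comparison operators named above.

```javascript
// Comparison functions for the four operators discussed above.
const comparators = {
  $lt: (a, b) => a < b,
  $lte: (a, b) => a <= b,
  $gt: (a, b) => a > b,
  $gte: (a, b) => a >= b,
};

// Returns true only if the value satisfies every condition (logical AND).
function matchesRange(value, condition) {
  return Object.entries(condition).every(([op, bound]) => {
    const compare = comparators[op];
    if (!compare) throw new Error(`unsupported operator: ${op}`);
    return compare(value, bound);
  });
}

const users = [
  { name: 'ann', age: 17 },
  { name: 'bea', age: 18 },
  { name: 'cal', age: 30 },
  { name: 'dan', age: 31 },
];
const matched = users.filter((u) => matchesRange(u.age, { $gte: 18, $lte: 30 }));
console.log(matched.map((u) => u.name)); // [ 'bea', 'cal' ]
```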

To query for documents where a key’s value is not equal to a certain value, you must use another conditional operator, "$ne", which stands for “not equal.” If you want to find all users who do not have the username “joe,” you can query for them using this:

> db.users.find({"username" : {"$ne" : "joe"}})
"$ne" can be used with any type.

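Question : Give an example of $in and $nin.
Answer : $in matches documents whose field equals any value in the given array, and $nin matches documents whose field equals none of them: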
db.temp.find({"id" : {"$in" : [725, 542, 390]}});
db.temp.find({"id" : {"$nin" : [725, 542, 390]}})

Question : What is $elemMatch?
Answer : The $elemMatch operator matches documents that contain an array field with at least one element that matches all the specified query criteria.

db.temp.find(
{ results: { $elemMatch: { $gt : 88} } }
)

Question : How to create indexes?
Answer : Creating an Index
Try creating an index on the username field. To create an index, we’ll use the createIndex() collection method.

> db.users.createIndex({"username" : 1})

Question : What is hint()?
Answer : The cursor hint() method enables us to specify a particular index to use, either by specifying its shape or its name. If we change our query slightly to use hint, as in the following example, the explain output will be quite different.
db.students.find({student_id:{$gt:500000}, class_id:54})
.sort({student_id:1})
.hint({class_id:1})
.explain("executionStats")

Question : What is the benefit of using $or in terms of indexes?
Answer : OR QUERIES
In general, MongoDB uses only one index per query. That is, if you create one index on {"x" : 1} and another index on {"y" : 1} and then do a query on {"x" : 123, "y" : 456}, MongoDB will typically use one of the indexes you created, not both. The main exception to this rule is "$or". "$or" can use one index per $or clause, as $or performs a separate query for each clause and then merges the results.
In general, doing two queries and merging the results is much less efficient than doing a single query; thus, whenever possible, prefer "$in" to "$or".

Question : Can we put indexes on subdocuments?
Answer : We could put an index on one of the subfields of "x", say "x.subX", to speed up queries using that field:

> db.users.createIndex({"x.subX" : 1})
You can go as deep as you’d like with these: you could index "x.y.z.w.a.b.c" (and so on) if you wanted.

Note that indexing the embedded document itself ("loc") has very different behavior than indexing a field of that embedded document ("loc.city"). Indexing the entire subdocument will only help queries that are querying for the entire subdocument. For example, if each user document has a "loc" subdocument with "ip", "city", and "state" fields, the query optimizer could only use an index on "loc" for queries that described the whole subdocument with fields in the correct order (e.g., db.users.find({"loc" : {"ip" : "123.456.789.000", "city" : "Shelbyville", "state" : "NY"}})). It could not use the index for queries that looked like db.users.find({"loc.city" : "Shelbyville"}).

When Not to Index
Indexes are most effective at retrieving small subsets of data and some types of queries are faster without indexes. Indexes become less and less efficient as you need to get larger percentages of a collection because using an index requires two lookups: one to look at the index entry and one following the index’s pointer to the document. A table scan only requires one: looking at the document. In the worst case (returning all of the documents in a collection) using an index would take twice as many lookups and would generally be significantly slower than a collection scan.
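
The lookup arithmetic in the paragraph above can be written out directly. This is a deliberately simplistic sketch in plain JavaScript: estimatedLookups is a hypothetical function, it counts two lookups per matched document for an index and one lookup per document for a collection scan, and it ignores caching, document size, and everything else that affects real performance.

```javascript
// Rough lookup counts under the simple model described above.
function estimatedLookups(collectionSize, fractionReturned) {
  const matched = Math.round(collectionSize * fractionReturned);
  return {
    viaIndex: 2 * matched,    // index entry + document, per matched document
    viaScan: collectionSize,  // every document is read once
  };
}

console.log(estimatedLookups(1000, 0.01)); // { viaIndex: 20, viaScan: 1000 }
console.log(estimatedLookups(1000, 1));    // { viaIndex: 2000, viaScan: 1000 }
```

Under this model the two approaches break even when half the collection is returned; actual crossover points depend on the data and have to be measured.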

Unfortunately, there isn’t a hard-and-fast rule about when an index helps and when it hinders as it really depends on the size of your data, size of your indexes, size of your documents, and the average result set size (Table 5-1). As a rule of thumb: if a query is returning 30% or more of the collection, start looking at whether indexes or table scans are faster. However, this number can vary from 2% to 60%.

Question : How do you create a unique index?
Answer : Unique indexes guarantee that each value will appear at most once in the index. For example, if you want to make sure no two documents can have the same value in the "username" key, you can create a unique index:
> db.users.createIndex({"username" : 1}, {"unique" : true})

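Question : How can you drop duplicates in an existing collection that already has duplicate data?
Answer : In MongoDB versions before 3.0, the dropDups option removed duplicates while building a unique index, keeping the first document found for each value and deleting the rest:
db.people.ensureIndex({"username" : 1}, {"unique" : true, "dropDups" : true});
The dropDups option was removed in MongoDB 3.0. On current versions, remove or fix the duplicate documents first, and then create the unique index.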

Question : What are sparse indexes?
Answer : Unique indexes count null as a value, so you cannot have a unique index with more than one document missing the key. However, there are lots of cases where you may want the unique index to be enforced only if the key exists. If you have a field that may or may not exist but must be unique when it does, you can combine the unique option with the sparse option.

To create a sparse index, include the sparse option. For example, if providing an email address was optional but, if provided, should be unique, we could do:
> db.users.createIndex({"email" : 1}, {"unique" : true, "sparse" : true});

Sparse indexes do not necessarily have to be unique. To make a non-unique sparse index, simply do not include the unique option.

Question : How can we drop the indexes?
Answer : You can remove unneeded indexes using the dropIndex command:
db.people.dropIndex("x_1_y_1")

Question : How can you make sure that, your database remains responsive while you create an index?
Answer : Building new indexes is time-consuming and resource-intensive. By default, MongoDB will build an index as fast as possible, blocking all reads and writes on a database until the index build has finished. If you would like your database to remain somewhat responsive to reads and writes, use the background option when building an index. This forces the index build to occasionally yield to other operations, but may still have a severe impact on your application (see [Link to Come] for more information). Background indexing is also much slower than foreground indexing.

Question : Is it faster to create an index on existing data or to create it before inserting the data?
Answer : If you have the choice, creating indexes on existing documents is slightly faster than creating the index first and then inserting all documents.

Question : How can you Create a Text Index?
Answer : Suppose we have a collection of Wikipedia articles that we want to index. To run a search over the text, we first need to create a "text" index. The following call to createIndex() will create a text index based on the terms in both the "title" and "body" fields.
> db.articles.createIndex({"title": "text", "body" : "text"});

This is not like “normal” multikey indexes where there is an ordering on the keys. By default, each field is given equal consideration in text indexes. You can control the relative importance MongoDB attaches to each field by specifying a weight:

> db.articles.createIndex({"title": "text", "body": "text"}, {"weights" : { "title" : 3, "body" : 2}});

Question : How can we run a text search against a text index?
Answer : Use the $text query operator to perform text searches on a collection with a text index. $text will tokenize the search string using whitespace and most punctuation as delimiters, and perform a logical OR of all such tokens in the search string. For example, you could use the following query to find all articles containing any of the terms “impact”, “crater”, or “lunar”. Note that because our index is based on terms in both the title and body of an article, this query will match documents in which those terms are found in either field. For purposes of this example, we will project the title so that we can fit more results on this page of text.
> db.articles.find({$text: {$search: "impact crater lunar"}}, {title: 1}).limit(10)

Question : How do you create capped collections?
Answer : Unlike normal collections, capped collections must be explicitly created before they are used. To create a capped collection, use the create command. From the shell, this can be done using createCollection:
> db.createCollection("my_collection", {"capped" : true, "size" : 100000});
The previous command creates a capped collection, my_collection, that is a fixed size of 100,000 bytes.
createCollection can also specify a limit on the number of documents in a capped collection in addition to the size limit:
> db.createCollection("my_collection2", {"capped" : true, "size" : 100000, "max" : 100});
You could use this to keep, say, the latest 10 news articles or limit a user to 1,000 documents.
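
The "keep only the latest N documents" behavior can be modeled in a few lines of plain JavaScript. CappedList is a hypothetical class written for this sketch; it models only the document-count limit ("max"), not the byte-size limit, and it is not how the server stores capped collections.

```javascript
// Simplified model of a capped collection limited by document count.
class CappedList {
  constructor(max) {
    this.max = max;
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.max) {
      this.docs.shift(); // the oldest document is aged out first
    }
  }
}

const latest = new CappedList(3);
for (let i = 1; i <= 5; i++) {
  latest.insert({ n: i });
}
console.log(latest.docs.map((d) => d.n)); // [ 3, 4, 5 ]
```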

Question : What are time-to-live (TTL) indexes?
Answer : As mentioned in the previous section, capped collections give you limited control over when their contents are overwritten. If you need a more flexible age-out system, time-to-live (TTL) indexes allow you to set a timeout for each document. When a document reaches a preconfigured age, it will be deleted. This type of index is useful for caching use cases such as session storage.

You can create a TTL index by specifying the expireAfterSeconds option in the second argument to createIndex:

> // 24-hour timeout
> db.sessions.createIndex({"lastUpdated" : 1}, {"expireAfterSeconds" : 60*60*24})
This creates a TTL index on the "lastUpdated" field. If a document’s "lastUpdated" field exists and is a date, the document will be removed once the server time is expireAfterSeconds seconds ahead of the document’s time.

To prevent an active session from being removed, you can update the "lastUpdated" field to the current time whenever there is activity. Once "lastUpdated" is 24 hours old, the document will be removed.
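
The expiry rule described above can be sketched as a simple predicate in plain JavaScript. isExpired is a hypothetical function for illustration; the real deletion is done by a background task on the server, so a document can outlive its expiry time until the next sweep.

```javascript
// Returns true if the document's date field is at least expireAfterSeconds old.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  if (!(value instanceof Date)) return false; // missing or non-date fields never expire
  return now.getTime() - value.getTime() >= expireAfterSeconds * 1000;
}

const DAY = 60 * 60 * 24; // 24-hour timeout, as in the example above
const now = new Date('2024-01-02T12:00:00Z');

const fresh = { lastUpdated: new Date('2024-01-02T00:00:00Z') }; // 12 hours old
const stale = { lastUpdated: new Date('2024-01-01T11:59:59Z') }; // just over 24 hours old
const noDate = { lastUpdated: 'yesterday' };                     // not a date

console.log(isExpired(fresh, 'lastUpdated', DAY, now));  // false
console.log(isExpired(stale, 'lastUpdated', DAY, now));  // true
console.log(isExpired(noDate, 'lastUpdated', DAY, now)); // false
```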

MongoDB sweeps the TTL index once per minute, so you should not depend on to-the-second granularity. You can change the expireAfterSeconds using the collMod command:

> db.runCommand( {"collMod" : "someapp.cache" , "index" : { "keyPattern" :
{"lastUpdated" : 1} , "expireAfterSeconds" : 3600 } } );

Question : Compare embedding vs. references.
Answer :
Embedding is better for...                                      | References are better for...
Small subdocuments                                              | Large subdocuments
Data that does not change regularly                             | Volatile data
When eventual consistency is acceptable                         | When immediate consistency is necessary
Documents that grow by a small amount                           | Documents that grow by a large amount
Data that you’ll often need to perform a second query to fetch  | Data that you’ll often exclude from the results
Fast reads                                                      | Fast writes

Question : When should you not use MongoDB?
Answer : While MongoDB is a general-purpose database that works well for most applications, it isn’t good at everything. Here are some tasks that MongoDB is not designed to do:
MongoDB versions before 4.0 do not support multi-document transactions, so systems on those versions that require transactions should use another data store. There are a couple of ways to hack in simple transaction-like semantics, particularly on a single document, but there is no database enforcement. Thus, you can make all of your clients agree to obey whatever semantics you come up with (e.g., “Check the "locks" field before doing any operation”) but there is nothing stopping an ignorant or malicious client from messing things up.
Joining many different types of data across many different dimensions is something relational databases are fantastic at. MongoDB isn’t supposed to do this well and most likely never will.
Finally, one of the big (if hopefully temporary) reasons to use a relational database over MongoDB is if you’re using tools that don’t support MongoDB. From SQLAlchemy to WordPress, there are thousands of tools that just weren’t built to support MongoDB. The pool of tools that support MongoDB is growing but is not yet the size of the relational database ecosystem.

Question : Example of $lookup
Answer :

db.state.aggregate([
    {$match: {"statename": "az"}},
    {$project: {"statename": 1, zipcode: 1}},
    {$unwind: "$zipcode"},
    {
        $lookup: {
            from: "homes",
            localField: "zipcode",
            foreignField: "zipcode",
            as: "homeInformation"
        }

    }, {$match: {homeInformation: {$ne: []}}},

    {
        $project: {
            name: 1, "zipcode": 1,
            "homeInformation.apt": 1,
            "homeInformation.homeowner.fullName": 1,
            "homeInformation.address": 1
        }
    }
])
