Tuesday, April 14, 2015

Data Analysis from MongoDB using R

Most of us are aware of R, a programming language and software environment for statistical computing and graphics. The R language is widely used among statisticians and data miners for developing statistical software and for data analysis. Pairing R with proper datasets and sources is the icing on the cake, so in this post we are going to see how to connect R to MongoDB and how to apply R's power to datasets stored in MongoDB.

Prerequisite for this demo: you should have the MongoDB daemon (mongod) up and running, either on a server or on your local machine.

Start your R instance, then install and load the "rmongodb" package by issuing the commands below:

        > install.packages("rmongodb")
        > library(rmongodb)

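Connect R to the MongoDB instance, storing the connection in a variable named mongo so that later commands can refer to it, then print it:

        > mongo <- mongo.create(host = "", name = "", username = "", password = "", db = "test", timeout = 0L)
        > mongo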

You'll get a response like the one below. With the connection configuration above you are connecting to the 'test' database of the Mongo instance with an empty username and password.

        [1] 0
        <pointer: 0x0884f0a8>
        [1] "mongo"
        [1] ""
        [1] ""
        [1] ""
        [1] ""
        [1] "test"
        [1] 0   

You can check whether R is connected to MongoDB by issuing the command below:

        > mongo.is.connected(mongo)
        [1] TRUE
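
In a script it can help to branch on that check instead of reading the printed value. The lines below are a minimal sketch, assuming a mongod listening on the default local address (the host value "127.0.0.1" is my assumption, not taken from the setup above); mongo.get.err() reports the numeric error code of a failed attempt.

```r
library(rmongodb)

# Assumption: mongod runs locally on the default port.
mongo <- mongo.create(host = "127.0.0.1", db = "test")

if (mongo.is.connected(mongo)) {
  message("Connected to MongoDB")
} else {
  # mongo.get.err() returns the error code recorded for this connection
  message("Connection failed, error code: ", mongo.get.err(mongo))
}
```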

R is now connected to the test database of the MongoDB instance, so you can fire simple Mongo queries and use R's power to run analytics over MongoDB datasets.

For example, to get a single record from Mongo:

        > mongo.find.one(mongo, "test.zip", list())
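
The value returned by mongo.find.one is a BSON object rather than a regular R structure. The following is a rough sketch of turning it into a plain R list, assuming a local mongod whose test database holds the zip collection; the only field name taken from this post is pop, so anything else you see in the output depends on your data.

```r
library(rmongodb)

# Assumption: a local mongod with a 'zip' collection in the 'test' database.
mongo <- mongo.create(db = "test")

if (mongo.is.connected(mongo)) {
  # mongo.find.one returns a mongo.bson object, or NULL when nothing matches
  rec <- mongo.find.one(mongo, "test.zip", list())
  if (!is.null(rec)) {
    doc <- mongo.bson.to.list(rec)  # convert BSON to a plain R list
    str(doc)                        # inspect the fields of the document
    print(doc$pop)                  # 'pop' is the field this post filters on
  }
  mongo.destroy(mongo)              # close the connection when done
}
```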

We can also use filter queries to fetch records from MongoDB into R. Note that mongo.find returns a cursor, which you then iterate over to read the matching records:

        > mongo.find(mongo, "test.zip", list(pop=list('$gt'=21L)))
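
To actually analyse the filtered records, the cursor has to be walked and its values pulled into R. Below is one possible sketch, not the only way to do it: summarise_pop and collect_pop are hypothetical helper names introduced here for illustration, and the code assumes a local mongod and that every matching document carries a numeric pop field.

```r
library(rmongodb)

# Hypothetical helper: basic statistics over a vector of population values.
summarise_pop <- function(pops) {
  c(n = length(pops), mean = mean(pops), max = max(pops))
}

# Hypothetical helper: walk a cursor and collect the 'pop' field of each record.
collect_pop <- function(cursor) {
  pops <- numeric(0)
  while (mongo.cursor.next(cursor)) {
    rec <- mongo.bson.to.list(mongo.cursor.value(cursor))
    pops <- c(pops, rec$pop)
  }
  mongo.cursor.destroy(cursor)  # release the cursor once it is exhausted
  pops
}

# Assumption: a local mongod with a 'zip' collection in the 'test' database.
mongo <- mongo.create(db = "test")
if (mongo.is.connected(mongo)) {
  cursor <- mongo.find(mongo, "test.zip", list(pop = list('$gt' = 21L)))
  print(summarise_pop(collect_pop(cursor)))
  mongo.destroy(mongo)
}
```

Keeping the statistics in a separate function means the same summary can be tried on any numeric vector, for example summarise_pop(c(10, 20, 30)), without a database connection.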

So, this is just a beginning; stay tuned for the next updates.
Thanks for visiting. I'd appreciate your thoughts and comments.