Casbah is a Scala toolkit for MongoDB. It adds a layer on top of the official mongo-java-driver for better integration with Scala.
The recommended way to get started is with a dependency management system.
libraryDependencies += "org.mongodb" %% "casbah" % "3.1.1"
Casbah is a MongoDB project and will continue to improve the interaction of Scala + MongoDB.
Add the import:
import com.mongodb.casbah.Imports._
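With that import in scope, documents can be built and read in a Scala-friendly way. The following is a minimal sketch, not taken from the original project: the field names ("symbol", "price", "exchange") and their values are made up for illustration.

```scala
import com.mongodb.casbah.Imports._

// Build a document directly from key/value pairs.
// The field names and values here are illustrative assumptions.
val stock: DBObject = MongoDBObject("symbol" -> "AAPL", "price" -> 139.2)

// Or assemble one step by step with the builder.
val builder = MongoDBObject.newBuilder
builder += "symbol" -> "GOOG"
builder += "price" -> 845.5
val other: DBObject = builder.result

// getAs returns an Option, so a missing field yields None instead of null.
val symbol: Option[String] = stock.getAs[String]("symbol")
val missing: Option[String] = stock.getAs[String]("exchange")
```

None of this needs a running MongoDB server; the objects only become documents in a collection once they are inserted.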
---------------------------------------------
You can get the source from:
https://github.com/alvinj/
Then you could modify your
-----------------
spb@spb-VirtualBox:~/
organization := "com.alvinalexander"
name := "ScalatraCasbahMongo"
version := "0.1.0-SNAPSHOT"
scalaVersion := "2.11.8"
libraryDependencies += "org.mongodb" %% "casbah" % "3.1.1"
libraryDependencies += "com.mongodb.casbah" % "casbah-gridfs_2.8.1" % "2.1.5-1"
libraryDependencies += "org.slf4j" % "slf4j-log4j12" % "1.7.24"
resolvers += "Sonatype OSS Snapshots" at "http://oss.sonatype.org/
spb@spb-VirtualBox:~/
spb@spb-VirtualBox:~/
[info] Loading project definition from /home/spb/mongoConnector/
[info] Set current project to ScalatraCasbahMongo (in build file:/home/spb/mongoConnector/
[info] Compiling 1 Scala source to /home/spb/mongoConnector/
[warn] there was one deprecation warning; re-run with -deprecation for details
[warn] one warning found
[info] Running casbahtests.MainDriver
debug: a
log4j:WARN No appenders could be found for logger (com.mongodb.casbah.commons.
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/
debug: b
debug: c
debug: d
debug: e
debug: f
debug: g
debug: h
debug: i
debug: j
debug: k
debug: l
debug: m
debug: n
debug: o
debug: p
debug: q
debug: r
debug: s
debug: t
debug: u
debug: v
debug: w
debug: x
debug: y
debug: z
sleeping at the end
sleeping: 1
sleeping: 2
sleeping: 3
sleeping: 4
sleeping: 5
sleeping: 6
sleeping: 7
sleeping: 8
sleeping: 9
sleeping: 10
sleeping: 11
sleeping: 12
sleeping: 13
sleeping: 14
sleeping: 15
sleeping: 16
sleeping: 17
sleeping: 18
sleeping: 19
sleeping: 20
sleeping: 21
sleeping: 22
sleeping: 23
sleeping: 24
sleeping: 25
sleeping: 26
sleeping: 27
sleeping: 28
sleeping: 29
sleeping: 30
game over
[success] Total time: 62 s, completed 13 Mar, 2017 5:37:31 PM
spb@spb-VirtualBox:~/
spb@spb-VirtualBox:~/
[info] Loading project definition from /home/spb/mongoConnector/
[info] Set current project to ScalatraCasbahMongo (in build file:/home/spb/mongoConnector/
[info] Packaging /home/spb/mongoConnector/
[info] Done packaging.
[success] Total time: 1 s, completed 13 Mar, 2017 5:54:42 PM
spb@spb-VirtualBox:~/
----------------
spb@spb-VirtualBox:~/Scala_
MongoDB shell version: 3.2.12
connecting to: test
Server has startup warnings:
> show dbs
local 0.000GB
mydb 0.000GB
> show dbs
finance 0.000GB
local 0.000GB
mydb 0.000GB
> show collections
> use finance
switched to db finance
> show collections
stocks
> db.stocks.find()
{ "_id" : ObjectId("
{ "_id" : ObjectId("
{ "_id" : ObjectId("
{ "_id" : ObjectId("
{ "_id" : ObjectId("
{ "_id" : ObjectId("
{ "_id" : ObjectId("
{ "_id" : ObjectId("
{ "_id" : ObjectId("
{ "_id" : ObjectId("
{ "_id" : ObjectId("
{ "_id" : ObjectId("
{ "_id" : ObjectId("
{ "_id" : ObjectId("
{ "_id" : ObjectId("
{ "_id" : ObjectId("
{ "_id" : ObjectId("
{ "_id" : ObjectId("
{ "_id" : ObjectId("
{ "_id" : ObjectId("
Type "it" for more
>
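The same lookup can be done from Scala through Casbah. This is a rough sketch under assumptions: it supposes mongod is reachable on localhost:27017 (the shell session does not show the host), and firstBatch is a hypothetical helper name, not part of Casbah.

```scala
import com.mongodb.casbah.Imports._

// Assumption: mongod listens on localhost:27017.
val mongoClient = MongoClient("localhost", 27017)

// Same database and collection as in the shell session above.
val stocks = mongoClient("finance")("stocks")

// Hypothetical helper: roughly what `db.stocks.find()` shows in the shell,
// i.e. the first batch of documents (the shell prints 20 at a time).
def firstBatch(coll: MongoCollection, batch: Int = 20): List[DBObject] =
  coll.find().limit(batch).toList
```

Calling firstBatch(stocks).foreach(println) prints the documents, and mongoClient.close() releases the connection when done.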
----------------
There are two ways of getting the data from MongoDB to Apache Spark.
Method 1: Using Casbah (a layer on the MongoDB Java driver)
// Requires: import com.mongodb.casbah.Imports._
// Connect to the remote MongoDB instance and select the collection.
val uriRemote = MongoClientURI("mongodb://RemoteURL:27017/")
val mongoClientRemote = MongoClient(uriRemote)
val dbRemote = mongoClientRemote("dbName")
val collectionRemote = dbRemote("collectionName")
// The whole cursor is read into a list on the driver before Spark sees the data.
val ipMongo = collectionRemote.find()
val ipRDD = sc.makeRDD(ipMongo.toList)
ipRDD.saveAsTextFile("hdfs://path/to/hdfs")
Method 2: Using the Spark workers
A better version of the code: the read is spread across the Spark workers and their cores, so the data is fetched in less time.
import org.apache.hadoop.conf.Configuration
import org.bson.BSONObject
// Requires the mongo-hadoop connector on the classpath.
val config = new Configuration()
config.set("mongo.job.input.format", "com.mongodb.hadoop.MongoInputFormat")
config.set("mongo.input.uri", "mongodb://RemoteURL:27017/dbName.collectionName")
val keyClassName = classOf[Object]
val valueClassName = classOf[BSONObject]
val inputFormatClassName = classOf[com.mongodb.hadoop.MongoInputFormat]
// Each input split is read by a Spark worker instead of the driver.
val ipRDD = sc.newAPIHadoopRDD(config, inputFormatClassName, keyClassName, valueClassName)
ipRDD.saveAsTextFile("hdfs://path/to/hdfs")
---------------------------------------------------------------
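The RDD produced by Method 2 holds (key, BSONObject) pairs, so saving it directly writes each pair as text. If only a few fields are needed, they can be projected first. This is a sketch under assumptions: toCsvLine and projectFields are hypothetical helper names, and the field names passed in ("symbol", "price") are made up, since the post does not show the document schema.

```scala
import org.apache.spark.rdd.RDD
import org.bson.BSONObject

// Hypothetical helper: turn one document into a comma-separated line,
// writing an empty string for any field the document does not have.
def toCsvLine(doc: BSONObject, fields: Seq[String]): String =
  fields.map(f => Option(doc.get(f)).map(_.toString).getOrElse("")).mkString(",")

// Hypothetical helper: drop the key and keep only the chosen fields.
def projectFields(rdd: RDD[(Object, BSONObject)], fields: Seq[String]): RDD[String] =
  rdd.map { case (_, doc) => toCsvLine(doc, fields) }
```

For example, projectFields(ipRDD, Seq("symbol", "price")).saveAsTextFile("hdfs://path/to/hdfs") would write one line per document, assuming those fields exist in the collection.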
Reference:
https://web.archive.org/web/20120402085626/http://api.mongodb.org/scala/casbah/current/setting_up.html#setting-up-sbt
 