Tuesday, April 16, 2013

Go (golang) And MongoDB - My First Example

As you probably know, Go is a language under development by Google and MongoDB is a database system created by 10gen. You can find instructions for installing Go (including for Windows, which I use) at http://golang.org/doc/install . You can also install MongoDB from http://www.mongodb.org/downloads . You will need to create a /data/db (Unix-based) or C:\data\db (Windows) directory structure for MongoDB to store your data.
You'll need other things to run this example. First, you'll need Bazaar's bzr command. You'll find what you need at http://wiki.bazaar.canonical.com/Download. Once you have that installed, go to a command prompt and enter

go get labix.org/v2/mgo

That should install the mgo (pronounced "mango") package. You might need to set the GOPATH environment variable first; I suggest setting it to the "pkg" subdirectory of Go's installation path, though any separate workspace directory also works. GOPATH tells Go where to find and install packages. Setting it to Go's installation directory itself isn't always a good idea. I learned that the hard way. The go tool downloads mgo's source with the bzr command; that's why you need Bazaar.
Now you're ready to do some coding. This example will show you how to retrieve first and last names from the user (it's a console app), check the database to see if that name was already entered, save the user's input if it wasn't, and finally display all the names in the MongoDB collection. For those not used to document-based NoSQL, a "collection" is similar to a table; "documents" are similar to records. The principal difference is that documents have no set structure; there is no schema forcing them to have anything in common except an ID. Some document-based systems might organize your documents for you based on their "plain-old" (POJO/POCO/etc) class. Also, some (like CouchDB) expect to see other fields, like a revision code.

Anyway, on to our code. The first things you do in a Go source file are declare a package name and import the packages your file needs. In our case, that would be done as follows...


package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"

	"labix.org/v2/mgo"
	"labix.org/v2/mgo/bson"
)

The "fmt", "bufio", and "os" packages handle the console I/O. As you might expect, the "strings" package has the functions for string-handling. The rest are for interacting with MongoDB. Go doesn't have classes in the usual sense, just structures. Unlike in C/C++ however, you don't need header files. You can also attach functions (methods) to Go structures easily, as you will see. Here are the structures and the function we will need:


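// Name is the document stored in the "names" collection.
type Name struct {
	Id     bson.ObjectId `bson:"_id"`
	MyName QueryName
}

// QueryName holds the fields used both for storage and for lookups.
type QueryName struct {
	FirstName string `bson:"FirstName"`
	LastName  string `bson:"LastName"`
}

// ConvertToInterface returns the name as a string-keyed map for use in queries.
func (q *QueryName) ConvertToInterface() interface{} {
	return map[string]interface{}{
		"FirstName": q.FirstName,
		"LastName":  q.LastName,
	}
}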



I did things this way to reduce code duplication; you could specify first-name and last-name fields in the Name structure directly instead. The Id field maps to MongoDB's _id key, which every stored document carries. Other parts of MongoDB tend to misbehave without it. The `bson:` tags tell mgo which key to use when it serializes a Go field into BSON, which is how MongoDB stores data. A field without a tag is still stored, under its lowercased name; that is why MyName shows up as "myname" in the queries later on. I created two structures because sending MongoDB queries structures with Ids can cause problems. The ConvertToInterface function translates the QueryName structure into a string-keyed map that MongoDB queries can use. The "q" receiver works like the "this" pointer in object-oriented languages; declaring it is also what attaches the function to the QueryName structure.
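
Here is a small standalone sketch of the receiver idea, separate from the tutorial program. It uses only the standard library (struct tags are plain strings, so no mgo import is needed). ToMap and Rename are hypothetical names I made up for illustration.

```go
package main

import "fmt"

// QueryName mirrors the struct from the article. The tags are plain strings,
// so this compiles without importing mgo or bson.
type QueryName struct {
	FirstName string `bson:"FirstName"`
	LastName  string `bson:"LastName"`
}

// ToMap is a hypothetical stand-in for ConvertToInterface. The receiver q
// plays the role that "this" plays in class-based languages.
func (q *QueryName) ToMap() map[string]interface{} {
	return map[string]interface{}{
		"FirstName": q.FirstName,
		"LastName":  q.LastName,
	}
}

// Rename changes the struct through the pointer receiver, so the caller
// sees the update.
func (q *QueryName) Rename(first, last string) {
	q.FirstName = first
	q.LastName = last
}

func main() {
	q := QueryName{FirstName: "Ada", LastName: "Lovelace"}
	fmt.Println(q.ToMap()["FirstName"])
	q.Rename("Grace", "Hopper")
	fmt.Println(q.ToMap()["LastName"])
}
```

Because the receiver is a pointer, Rename modifies the caller's variable; with a value receiver it would only change a copy.
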
The next thing is to start creating the entry point function and connect to the database. Assuming that you're running the code on the same machine as MongoDB, that would be


func main() {
	session, sessionErr := mgo.Dial("localhost")
	if sessionErr != nil {
		fmt.Println(sessionErr)
	} else {
		fmt.Println("Session created")

Once you realize Go functions can return multiple values, that should be self-explanatory. The Dial function connects the code to MongoDB and returns both a session and an error. The := operator declares a variable and initializes it in one step, inferring the type from the right-hand side. To assign to a variable that already exists, you use the = operator.
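
If multiple return values and the two operators are new to you, this standalone sketch may help. It is not part of the tutorial program; parsePort is a hypothetical helper that returns a value and an error in the same shape that Dial does.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePort is a hypothetical helper that returns a value and an error,
// the same (result, error) shape that mgo.Dial uses.
func parsePort(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("bad port %q: %v", s, err)
	}
	return n, nil
}

func main() {
	port, err := parsePort("27017") // := declares port and err
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("port:", port)

	port, err = parsePort("oops") // = reuses the existing variables
	fmt.Println(port, err != nil)
}
```
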
Now, to ask for and read the user's input, you would use


		r := bufio.NewReader(os.Stdin)
		fmt.Print("First name: ")
		first, _ := r.ReadString('\n')
		fmt.Print("Last name: ")
		last, _ := r.ReadString('\n')
		first = strings.TrimSpace(first)
		last = strings.TrimSpace(last)

This is a reliable way to read the keyboard in a console app if the input can contain spaces. If you've used older C-style languages, os.Stdin should look familiar as the standard input stream. ReadString('\n') returns everything up to and including the newline produced when the user hits the Enter key. TrimSpace strips that newline off, along with any other leading or trailing whitespace (such as the carriage return that Windows adds).
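
To try the reading logic without typing anything, here is a standalone sketch that feeds the reader from a string instead of os.Stdin. The helper names readField and readNames are hypothetical, and unlike my code above it checks the error that ReadString returns.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readField reads one line from r and trims surrounding whitespace,
// including the "\r\n" line ending that Windows consoles produce.
// Reaching end of input is not treated as an error here.
func readField(r *bufio.Reader) (string, error) {
	line, err := r.ReadString('\n')
	if err != nil && err != io.EOF {
		return "", err
	}
	return strings.TrimSpace(line), nil
}

// readNames is a hypothetical wrapper that reads a first and a last name
// from a string standing in for keyboard input.
func readNames(input string) (first, last string, err error) {
	r := bufio.NewReader(strings.NewReader(input))
	if first, err = readField(r); err != nil {
		return "", "", err
	}
	if last, err = readField(r); err != nil {
		return "", "", err
	}
	return first, last, nil
}

func main() {
	first, last, err := readNames("Mary Ann\r\nvan Dyke\n")
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	fmt.Printf("%q %q\n", first, last)
}
```

In a real console app you would pass bufio.NewReader(os.Stdin) to readField instead of wrapping a string.
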
Now, it's time to finish connecting to MongoDB and look up the input:


		database := session.DB("go_mongo")
		collection := database.C("names")
		/* collection.DropCollection()
		collection = database.C("names") */
		nameForQuery := QueryName{FirstName: first, LastName: last}
		// query1 := collection.Find(bson.M{"myname": bson.M{"FirstName": first, "LastName": last}})
		param := nameForQuery.ConvertToInterface()
		query1 := collection.Find(bson.M{"myname": param})
		count, _ := query1.Count()

If you haven't guessed, "go_mongo" is the name of the database, and "names" is the collection's name. If you're familiar with C-style languages, you'll probably recognize the comments. The commented code can be omitted. If you're curious, DropCollection destroys the collection. As you can see, you can get a fresh handle to the collection after destroying it. Both Find calls feed a first and last name to MongoDB to query for an exact match on the embedded "myname" document. One caveat: an exact match on an embedded document is sensitive to field order, and Go maps have no guaranteed order, so a dot-notation filter such as bson.M{"myname.FirstName": first, "myname.LastName": last} is the more robust choice. As you might have guessed, the Count function retrieves the number of matches. It also returns an error as its second value, which you would check against nil to detect actual errors rather than discard with _ as I did. To add the data you would do something like


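		if count > 0 {
			fmt.Println("That name is already in the collection")
		} else {
			// Insert only when the lookup found no existing match.
			name := Name{Id: bson.NewObjectId(), MyName: QueryName{FirstName: first, LastName: last}}
			if addErr := collection.Insert(name); addErr != nil {
				fmt.Println("Error on add:", addErr)
			} else {
				fmt.Println("Name was successfully added")
			}
		}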

"name" is a structure of type "Name", initialized with the entered data and a newly generated document Id. Any such structure could be added to the collection. If your domain structure is annotated like mine, with an identical Id field, it should work in your own code. Substructures and arrays are perfectly acceptable; those are some of the beauties of NoSQL.    
Now for the fun part, namely dumping a whole collection. Don't let other people's documentation get you down; it can be done with some trickery. Here is how I did it...

		var results []Name
		collection.Find(nil).All(&results)
		for _, name := range results {
			fmt.Println(name.Id, ":", name.MyName.FirstName, name.MyName.LastName)
		}

The ampersand passes the address of results, so All can fill the slice through that pointer. That is another way a Go function can return values. As you can see, I'm querying on "nil", that is to say, essentially nothing! That will retrieve the whole collection. The same trick works with CouchDB/Couchbase; it might work with other document-based systems. If your whole collection is nothing but Names, this code will work. If you mixed document types in the collection (which you can do), something like this might be necessary...




		var results2 []bson.M
		collection.Find(nil).All(&results2)
		for _, obj := range results2 {
			fmt.Println(obj)
		}
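
The pass-by-address pattern behind All(&results) can be tried without a database. In this standalone sketch, Person and loadAll are hypothetical; mgo's All actually accepts any slice pointer and fills it by reflection, whereas this version is tied to one slice type to keep things short.

```go
package main

import "fmt"

// Person is a hypothetical record type for this sketch.
type Person struct {
	First, Last string
}

// loadAll is a hypothetical stand-in for a call like All(&results): it
// receives a pointer to the caller's slice and appends results through it.
func loadAll(out *[]Person) error {
	if out == nil {
		return fmt.Errorf("loadAll: nil destination")
	}
	*out = append(*out,
		Person{First: "Ada", Last: "Lovelace"},
		Person{First: "Grace", Last: "Hopper"},
	)
	return nil
}

func main() {
	var results []Person // starts out nil, like results in the article
	if err := loadAll(&results); err != nil {
		fmt.Println(err)
		return
	}
	for i, p := range results {
		fmt.Println(i, p.First, p.Last)
	}
}
```
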

The for... range syntax is how Go does a for-each. "bson.M" is mgo's general-purpose document type; it is a map from string keys to values of any type (map[string]interface{}). Lastly, you close the connection as follows...


		session.Close()
	}
}
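
If you read mixed documents into maps as above, you will usually want to pull individual fields back out. Here is a standalone sketch of that, using for... range and type assertions. Doc is a local type with the same underlying type as bson.M, and fullName is a hypothetical helper; I am assuming the nested "myname" document arrives as the same map type, so treat this as an illustration rather than drop-in mgo code.

```go
package main

import "fmt"

// Doc has the same underlying type as bson.M: a map from string keys to
// values of any type.
type Doc map[string]interface{}

// fullName is a hypothetical helper that pulls a name out of a document
// shaped like the ones in this article. It reports false if the fields
// are missing or are not strings.
func fullName(d Doc) (string, bool) {
	inner, ok := d["myname"].(Doc)
	if !ok {
		return "", false
	}
	first, ok1 := inner["FirstName"].(string)
	last, ok2 := inner["LastName"].(string)
	if !ok1 || !ok2 {
		return "", false
	}
	return first + " " + last, true
}

func main() {
	docs := []Doc{
		{"myname": Doc{"FirstName": "Ada", "LastName": "Lovelace"}},
		{"title": "a document of some other kind"},
	}
	for i, d := range docs {
		if name, ok := fullName(d); ok {
			fmt.Println(i, name)
		} else {
			fmt.Println(i, "not a name document")
		}
	}
}
```

The two-value form of the type assertion (value, ok) avoids a panic when a document does not have the expected shape.
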

Go is a compiled language, but go run program.go compiles and runs a program in one step. To compile without running, use go build program.go . Go source files always have the extension ".go". In Windows, "go build" creates an .exe file if successful. "go build -o filename" enables you to use a target name other than the default. Compiled library packages are stored as ".a" archive files, which are linked statically rather than loaded at run time like a ".dll".

I hope you found my code useful. Check out my other blogs too. Bye for now, and God bless!