Archives

The 7th Week of GSoC

Time to summarize! I’ll keep it brief.

What I Have Done

  1. Finished the sync work with OpenLDAP.
  2. Built a dynamic migration solution for temporary use.
  3. Built a simple script for migrating data from OpenLDAP to Mongo*. Here is the repo.

What I Will Do

Deploy the Dashboard 2.0!

That’s it.

The OpenMRS ID Dashboard 2.0

Finally, after weeks of exploring and attempting, the new dashboard is nearly in place.

The main purpose of Dashboard 2.0 is to provide an extensible user model for OpenMRS ID, so we used MongoDB as the backend database. This lets us add free-form data to the database, as discussed and explained in these talks: My Midterm presentation, Discussion.

Concern for LDAP

While gaining the new features, we can’t leave the old ones behind. For backward compatibility, we have to keep an LDAP server, so that the Atlassian Crowd based apps can stay happy as usual.

My original plan was to create an LDAP layer on top of the new user data model via ldapjs. However, after some attempts, I found that it’s not feasible, or more accurately, not efficient. So I turned to the sync plan. Specifically, we’ll keep the OpenLDAP server and sync with it. Note that this sync is one-directional, that is, we can only sync changes from the MongoDB side to LDAP.

For data migration, I don’t have a good idea yet, because I don’t know much about the production environment. So to quickly put Dashboard 2.0 into production and test it, I chose a dynamic migration approach. In detail, I’ve bound an LDAP query to one of the Mongo query methods: when I don’t find a record in Mongo, I query LDAP, and if the record exists there, I copy it into Mongo.
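
The lookup described above can be sketched roughly like this. It is only an illustration, not the dashboard’s actual code: `mongoStore`, `ldapStore` and `findUser` are hypothetical names, and plain in-memory objects stand in for the real Mongo and LDAP queries.

```javascript
// Hypothetical in-memory stand-ins for the two backends (assumption).
const mongoStore = {};
const ldapStore = {
  alice: { username: 'alice', email: 'alice@example.com' },
};

// Look in Mongo first; on a miss, fall back to LDAP and copy the record over.
function findUser(username, callback) {
  if (mongoStore[username]) {
    return callback(null, mongoStore[username]);
  }
  const legacy = ldapStore[username];
  if (!legacy) {
    return callback(null, null); // unknown in both stores
  }
  mongoStore[username] = Object.assign({}, legacy); // migrate on first access
  return callback(null, mongoStore[username]);
}

findUser('alice', function (err, user) {
  console.log(user.email);
});
```

After the first lookup the record lives in the Mongo stand-in, so later lookups never touch LDAP. That is the whole point of migrating lazily, one user at a time.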

New Procedure for Signup

I’ll show this in one diagram:

Signup Procedure

Something about Data Model

Our user-related data models are very simple. We only have user and group schemas.

Apart from the basic attributes, two things deserve special mention.

The user.extra

It’s of Mixed type, so you may treat it as a normal JSON object.

In the future, we’ll use it to store all kinds of other data that clients put in via our API. For details, see the Discussion.

The relation between Groups and Users

One user may be a member of several groups, and each group has several users, so it’s a “many to many” relation.

To manage relationships in Mongo*, you can just store the ObjectId of one doc in another, so they can easily reference each other.

But the number of groups will be small anyway, so for the sake of simplicity, I’ve just stored the group names in user docs.

And to easily get all the members of one group, each group has a userList array containing usernames and ObjectIds. With the usernames known, we don’t really have to query for users in most cases.

However, as far as I know, Mongo* doesn’t have a built-in mechanism to query the array belonging to one document. But again, considering we won’t have too many users :|, I’ll just do a plain O(n) one-by-one search. No need to use any data structures or algorithms, hahahahhhhhhh…
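
Here is a minimal sketch of that one-by-one search. The document shape (`groupName`, `objId`, `username`) and the helper name `indexOfUser` are assumptions for illustration; the real schema may use different field names.

```javascript
// Hypothetical group document; field names are assumptions.
const group = {
  groupName: 'dashboard-users',
  userList: [
    { objId: 'id-1', username: 'alice' },
    { objId: 'id-2', username: 'bob' },
  ],
};

// Plain O(n) scan over the group's userList; returns -1 when not a member.
function indexOfUser(group, username) {
  for (let i = 0; i < group.userList.length; ++i) {
    if (group.userList[i].username === username) {
      return i;
    }
  }
  return -1;
}

console.log(indexOfUser(group, 'bob')); // prints 1
```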

That should be it.

The 6th Week of GSoC

Time to summarize again.

Well, I haven’t done much this week… because it’s also the last week of the semester. I’ve been preparing for my last final, and celebrating with my classmates after it.

Anyway, here is the usual list.

What I Have Done

  1. Basically finished all the integration work.
  2. Created a prototype of the new /profile page. Here is the mockup: mockup
  3. Fixed a few trivial bugs.

What I Will Do

  1. Finish up all the validation work.
  2. Start to work on ldapjs.

That’s it.

The 5th Week of GSoC

It’s the 5th week, the last week before the midterm evaluation.

Oh… I’m exhausted, having just recorded my midterm presentation. It cost me about one whole day to make the slides, and 4 hours to record the video… Well, my English is bad, so I had to retake it again and again. Even so, the outcome isn’t very good, because after working and speaking for so long, my voice went hoarse and my mind went blank…

You can see my presentation here.

Anyway let’s sum up this week.

What I Have Done

  1. Completed the integration of the signup module with the new data model.
  2. Integrated the new data model with the auth module.
  3. Built a new model for verification.
  4. Used the new model to reimplement the email verification.
  5. Made a midterm presentation, see above.

What I Will Do

  1. Complete the integration work for the reset-pwd and profile modules.
  2. Figure out how to migrate data.
  3. Fix some issues maybe.

That’s it, I need to sleep.

The 4th Week of GSoC

Let’s sum up the 4th week of GSoC!!!

First, another hard one: I’ve spent a lot of time fixing bugs in some async logic. :/

What I Have Done

  1. Roughly finished integrating the new Mongo data model with the signup module. Now you can create a new account in Mongo, but you can’t log in yet…

  2. Fixed ID-6, a small defect in the login hint message. It was easy to fix, just adding a link was enough. But I think there is still room for improvement. Maybe after users fill out the form, they could be redirected to the original page. Or the link could simply open in a new page by default, since there is still the verification step.

  3. Fixed ID-14, which is about the verification email. The old dashboard would send the verification email even if the account hadn’t been created. This is easily solved with the async library: simply using async.series() does the trick.

  4. Reimplemented the validation middleware for the signup module. Again, by using async, the whole control flow is clearer now. I spent most of my time on that, because I made some rookie mistakes writing async code in Node.js.

  5. Created and fixed ID-29. This issue is about reusing the values typed into the validation form. If the user typed some wrong values, the old dashboard didn’t keep them, so the user had to type everything again. A small bug, but annoying.
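
The idea behind the ID-14 fix in item 3 is to run the steps strictly in order and stop at the first error. The sketch below shows that pattern with a tiny hand-written `series` helper, which is a stand-in and not the async library itself; `createAccount` and `sendVerificationEmail` are hypothetical placeholders for the real signup steps.

```javascript
// Minimal stand-in for the series pattern: run tasks one after another and
// stop at the first error. This is NOT the async library, just the same idea.
function series(tasks, done) {
  let i = 0;
  function next(err) {
    if (err || i === tasks.length) {
      return done(err || null);
    }
    const task = tasks[i++];
    task(next);
  }
  next();
}

const sent = []; // records which users would have received an email

// Hypothetical steps: account creation fails for an empty username.
function createAccount(username) {
  return function (cb) {
    cb(username ? null : new Error('account not created'));
  };
}
function sendVerificationEmail(username) {
  return function (cb) {
    sent.push(username);
    cb(null);
  };
}

series([createAccount(''), sendVerificationEmail('')], function (err) {
  console.log(err.message, sent.length); // the email step never ran
});
```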

What I Will Do

  1. Complete the work on the signup module, and march on to the auth (login/logout) and profile modules.

  2. Continue improving the validation code.

  3. Fix some more issues.

  4. Test Mongoose to see whether it is possible for users to change the schema while the server is running, and whether that is a good idea… For details, see this talk posted by Elliott.

That’s it.

The 3rd Week of GSoC

Oh the 3rd week has ended, time for sum-up. It’s been a really really tough week :(

Those Mongoose living in MongoDB are so hard to catch!!!

Let it be terse and concise.

What I Have Done

  1. I spent tons of time messing around with MongoDB and Mongoose to build the new data model for the Dashboard. This hard process really improved my understanding of Node and database stuff. This post is one of the fruits.

  2. Found some bugs in the dashboard, see ID-24, ID-25.

  3. Created one wiki page for the new data model design, and opened a talk topic to discuss some ideas about the new OpenMRS-ID.

  4. Created a new-db branch for the MongoDB development.
  5. Used mocha as the test framework, and adopted the BDD style of Chai. See this file for an example.
  6. Gained SSH access to the staging server.
  7. Learned a little CSS, now the blockquote will align to the left :)

What I Will Do

  1. Continue working with Mongo*.
  2. Fix some issues on master branch, and avoid making conflicts with the new-db branch.
  3. Configure Mongo on the remote server.
  4. Add more unit tests.
  5. Try out the ldapjs with Mongo*.

That’s it. Time for sleeping…

Dealing With Unique Index of Mongoose and Mongodb

Mongoose provides a unique attribute for schema types. By setting it to true, you can easily create a unique index for an attribute.

However, there are some issues you should know about.

MongoDB won’t ensure that items within the same array are unique.

Say you want an email list in your user schema, and every address must be unique across all users, whether two addresses sit in the same array or not. So you set this up:

var Mongoose = require('mongoose');
var Schema = Mongoose.Schema;

var userSchema = new Schema({
    emailList: {
        type: [String],
        unique: true // ask for a unique index on this path
    }
});
var User = Mongoose.model('User', userSchema);

And now, if you try this:

var user1 = new User({emailList: ['john@doe.com', 'foo@bar.com']});
var user2 = new User({emailList: ['foo@bar.com']});
user1.save();
user2.save();

You’ll get an E11000 error as desired, so everything seems fine. (If you don’t get this error, please see the section below.) But try this:

var user3 = new User({emailList: ['john@doe.com', 'john@doe.com']});
user3.save();

It will pass the unique test… That’s definitely not what we want.

The solution is simple: create your own validator. In this situation:

var chkArrayDuplicate = {
    validator: function (arr) {
        // Sort a copy so that equal items end up next to each other.
        var sorted = arr.slice();
        sorted.sort();

        var i;
        for (i = 1; i < sorted.length; ++i) {
            if (sorted[i] === sorted[i - 1]) {
                return false;
            }
        }
        return true;
    },
    msg: 'Some items are duplicate'
};

And note that emails should be treated as case-insensitive here, so you should add a lowercase setter as well. Mongoose applies the setter first and then the validator, so that will be enough. It goes like this:

function arrToLowerCase(arr) {
    arr.forEach(function (str, index, array) {
        array[index] = str.toLowerCase();
    });
    return arr;
}
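
To see why the order matters, here is a small standalone sketch that applies a setter and a validator by hand, in the same order the post says Mongoose applies them. It doesn’t use Mongoose at all, and `hasNoDuplicates` is a hypothetical Set-based variant of the duplicate check, not the validator above.

```javascript
// Non-mutating lowercase setter (same intent as arrToLowerCase in the post).
function arrToLowerCase(arr) {
  return arr.map(function (str) {
    return str.toLowerCase();
  });
}

// Hypothetical Set-based duplicate check, swapped in for the sort-based one.
function hasNoDuplicates(arr) {
  return new Set(arr).size === arr.length;
}

const raw = ['John@doe.com', 'john@doe.com'];
console.log(hasNoDuplicates(raw));                 // true: validator alone misses it
console.log(hasNoDuplicates(arrToLowerCase(raw))); // false: setter first, then validator
```

Without the setter, the two spellings look different to the validator, so the duplicate slips through.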

Sometimes the unique index won’t work as you expect.

It’s normal to change our schema often during development and testing. Sometimes when we change the schema and add the unique qualifier, Mongoose and MongoDB won’t reflect the change. The unique index won’t be generated, so uniqueness isn’t enforced.

Sometimes that happens because you have already stored some duplicates in the collection.

But fixing this problem may not be as simple as dropping the collections or the databases, or even restarting the mongod instance, because the problem may lie in your code.

You can first check your collection’s indexes with this command in the mongo shell (I’ll use the previous users collection as an example):

use your_testing_db
db.users.getIndexes()

If the index was created, then something terrible may have happened. More likely, though, you won’t have that index. You can add this listener to your model:

User.on('index', function (err) {
    if (err) {
        console.error(err);
    }
});

This listener will listen for the index event emitted once ensureIndex() has finished.

When your application starts up, Mongoose automatically calls ensureIndex for each defined index in your schema.

See Indexes and Model.ensureIndexes.

And keep in mind that almost everything in Node.js is asynchronous, and ensureIndex() is as well. So every time you try to fix the problem, you first drop the collection or the database, and the index is dropped with it. Then you run your code again, Node fires ensureIndex(), and before it has done its job, save() gets fired. So you end up with the same mess.

So the best you can do is make sure the index is created before you write any data to Mongo. In production, you’d better create your collections in the database first.

When you test your code, better use a separate database. And because we constantly drop things when testing, you should always listen for the index event and only write data after it fires.
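
A rough sketch of “write only after the index event” could look like this. To keep it runnable without a database, a plain Node EventEmitter stands in for the Mongoose model; `afterIndex` and `FakeUser` are hypothetical names, and the only assumption about the model is that it emits an index event as described above.

```javascript
const EventEmitter = require('events');

// Run `work` only after the model has reported that index creation finished.
function afterIndex(model, work) {
  model.once('index', function (err) {
    work(err || null);
  });
}

// Stand-in for a Mongoose model (assumption): it only emits 'index'.
const FakeUser = new EventEmitter();
const log = [];

afterIndex(FakeUser, function (err) {
  log.push(err ? 'index failed' : 'safe to save');
});

log.push('index building');
FakeUser.emit('index'); // what the model would do once the index is ready
console.log(log);
```

The listener has to be attached right after the model is defined. If the event has already fired by the time you subscribe, the callback never runs.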

Empty arrays will be counted

That’s a weird feature of MongoDB: if you use a unique index, you can’t have two documents that both have an empty array. They will be counted as duplicates, even if you use sparse. You can check SERVER-3934.

That’s it.

The 2nd Week of GSoC

The second week has ended, another fun week, yeah!

So let me sum it up.

What I Have Done

Refactoring

Well, last week I successfully rearranged the routing logic. But when I tried to refactor other source files, I found it very hard, well, almost impossible, to refactor correctly without any mistakes when there are no tests.

And since these files will most probably be replaced, I decided to suspend the refactoring work. I simply used JsFormat, a Sublime Text plugin, to format those files, so they are more readable now. :D

Fixing ID-12 & ID-22

OpenMRS uses a lot of teamwork tools, like Wiki and Jira. When people find bugs or want to add new functionality, they can report it on Jira. And here is the category for ID-Dashboard.

While I was working on the refactoring, Elliott asked me to create a new issue for it, so we could publicly track my work and let people comment on it. And by that chance I found that a lot of issues were lying there waiting for development.

So I created my refactoring issue, ID-22, and picked up ID-12 and ID-19. Then I successfully solved them.

ID-19 is just a simple typo fix, and ID-12 is about a session creation problem.

Elliott found that a lot of extra sessions were being created in the database, and the global navbar was to blame. The navbar is a submodule of the dashboard, and when other modules like Wiki want to use it, they make a GET request to dashboardHOST/globalnav and add it to their own pages according to the response.

And due to a faulty design, the Dashboard creates a session for every request, whether one is needed or not. Hence not only /globalnav, but also things like /resource/* create sessions. That’s terrible: it adds unnecessary pressure on the DB.

So I took a look at the source and found that the real problem lies in the way the express.session middleware is used. The source just calls app.use() on it directly, which makes Express generate a session for all routes.

After some searching, I found that the best practice may be to store the session middleware first, and then use it only where necessary. But later I found that this would be a huge modification, ’cause there are other middleware that depend on the session and are used globally as well. So instead, I simply created an exception list for the session middleware as a temporary fix.
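
The exception list idea can be sketched as a small wrapper around any middleware. This is not the actual patch: `unlessPath` and `fakeSession` are hypothetical names, and the fake middleware only counts calls so the sketch runs without Express.

```javascript
// Wrap a middleware so it is skipped for URLs on an exception list.
function unlessPath(prefixes, middleware) {
  return function (req, res, next) {
    const skip = prefixes.some(function (prefix) {
      return req.url.indexOf(prefix) === 0;
    });
    if (skip) {
      return next(); // no session for this request
    }
    return middleware(req, res, next);
  };
}

// Fake session middleware (assumption): just counts how often it runs.
let sessionsCreated = 0;
function fakeSession(req, res, next) {
  sessionsCreated += 1;
  next();
}

const guarded = unlessPath(['/globalnav', '/resource/'], fakeSession);
guarded({ url: '/globalnav' }, {}, function () {});
guarded({ url: '/profile' }, {}, function () {});
console.log(sessionsCreated); // prints 1
```

In the real app, the wrapped function would be passed to app.use() in place of the bare session middleware, so the other session-dependent middleware can stay where they are.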

And in the process of solving this issue, Elliott told me that we can create a subApp for those submodules, and then let the main app use it. Like this:

var subApp = express.createServer();
// don't call subApp.listen
subApp.get('/some', function (req, res, next) {
    // do something
});
// do something else
app.use('/parentUri', subApp);

And now you can visit /parentUri/some.

That is a very good feature that reduces coupling. However, note that the subApp works like a middleware, so it will be influenced by other middleware that the main app uses.

Starting to Dig Mongoose

First, you need to configure a MongoDB instance, and I did. The details are here.

What I Will Do

  1. Continue to configure the db stuff, and make a guide for that.

  2. Start designing the basic data model with Mongoose.

  3. Begin to study unit testing tools, like Mocha and Jasmine.

  4. Fix more issues maybe.

So that’s it, I need to do my laundry :<

Adding Users for Mongodb

Today, after messing around with MongoDB for hours, I finally figured out one thing… adding a user with a password for a specific database in MongoDB. :<

So I’d better write a blog post to note this down…

My situation was simple: there is only one single instance, so I don’t need to touch the complex ‘replica set’ stuff.

So, a few things to note:

First, enable the auth mechanism. If you use the config file, modify it. By default, the config lies in /etc/mongod.conf. Uncomment this line:

auth = true

Or you can start mongod with the --auth option.

Then use mongo admin to connect to the server and switch to the admin database, where you’ll create the admin user, and then use this admin user to create a user for our db.

If you only want this user to have the minimum privileges needed to create other users, you can give it the userAdminAnyDatabase role.

However, this role is very limited. So for development convenience, I used root. The commands are:

mongo admin

db.createUser( {
    user: 'userNameHere',
    pwd: 'passwordHere',
    roles: [
        { role: 'root', db: 'admin'}
    ]
})

If you get “db.createUser() is not a function”, please check your mongo version. This method was not introduced until 2.6. You can use db.addUser() in previous versions.

Then switch to the database you want, and create another user with the readWrite role.

use yourDB

db.createUser( {
    user: 'userNameHere',
    pwd: 'passwordHere',
    roles: [
        { role: 'readWrite', db: 'yourDB'}
    ]
})

That’s it. Now you can use this user to connect to mongo. You can test it with db.auth('username', 'password').

Difference between "exports" and "module.exports"

In Node, if we want something to be accessible from outside, we export it by assigning it to a property of exports or module.exports. Then we can use it by simply requiring it in other files.

But sometimes we want require to return a function directly. Rookies might write something like this:

exports = function () {
    // something
};

But this approach will certainly fail. If you want to export a function, assign it to module.exports instead of exports.

The reason lies in the implementation of Node’s module system. In Node, every source file is an independent module. The Module object definition looks like this (we’ll just focus on exports):

function Module(id, parent) {
    this.id = id;
    this.exports = {};
    this.parent = parent;
    // something else
}

And when Node compiles your source file, it wraps it first:

(function (exports, require, module, __filename, __dirname) {
    // your code
})

Then Node calls this function and passes those parameters to it, so we can use them in our files. exports is a shortcut for module.exports: both initially point to the same object. So if you reassign exports to a new value, the original module.exports object won’t be affected at all.

Conclusion

In a nutshell, exports is just a reference to module.exports, so don’t assign it a new value on its own. Instead, use this:

exports = module.exports = function () {
    // blahblah
};