My program should perform the following task: it listens on an HTTP port, and after receiving a request it does the following things.

  1. Connect to gearman
  2. Parse gearman payload to JSON (up to 100 bytes)
  3. Connect to Redis
  4. Parse redis payload to JSON (256 bytes to 10KB; in 80% of cases it will be ~256 bytes)
  5. Put some data in MySQL
  6. Put data in Redis server

Since my program seems to be IO-driven, I chose Node.js for development. But after building it, I am seeing high CPU usage with Node.js.

My program takes 70%-100% CPU with 20 parallel clients. At first I thought JSON parsing could be the issue. I was targeting about 1K-3K requests per second, as my Redis server is able to process that many requests in one second.

But for profiling, I started with a simple sample HTTP server in Node.

Example code:

var http = require('http');
var url = require("url");

http.createServer(function (req, res) {
    // Parsed but not used yet in this sample.
    var uri = url.parse(req.url).pathname;
    var body = "";

    // Accumulate the request body as it arrives.
    req.on('data', function (chunk) {
        body += chunk;
    });

    // Reply once the whole request has been received.
    req.on('end', function () {
        res.writeHead(200, {'Content-Type': 'text/plain'});
        res.end('hi vivek');
    });
}).listen(9097, "127.0.0.1");

Now my concern is with this hello world HTTP server: Node's CPU usage is spiking between 17% and 20%.

My node version is v0.10.0
My OS is ubuntu 12.04

My cpu information is

processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 23
model name  : Intel(R) Core(TM)2 Duo CPU     E8400  @ 3.00GHz
stepping    : 10
microcode   : 0xa07
cpu MHz     : 2992.491
cache size  : 6144 KB
physical id : 0
siblings    : 2
core id     : 0
cpu cores   : 2
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm dtherm tpr_shadow vnmi flexpriority
bogomips    : 5984.98
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model       : 23
model name  : Intel(R) Core(TM)2 Duo CPU     E8400  @ 3.00GHz
stepping    : 10
microcode   : 0xa07
cpu MHz     : 2992.491
cache size  : 6144 KB
physical id : 0
siblings    : 2
core id     : 1
cpu cores   : 2
apicid      : 1
initial apicid  : 1
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm dtherm tpr_shadow vnmi flexpriority
bogomips    : 5984.96
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

My questions are:

  1. Is Node the correct choice for my problem description?
  2. If Node.js is not the correct choice for my problem description, what would be a better alternative? A thread-based approach will not scale for an IO-driven application.

  3. How do I find out what is causing this much CPU usage, both in the complete integrated application and in the simple HTTP program?

  4. According to some Node blogs, I can support up to 10K parallel requests with Node.js. But if Node is spiking to around 20% CPU with only a simple HTTP server, how will I be able to support 10K users?
  • Yes. I've personally had very good experience with node for this sort of thing. (Although, don't buy all the "threads don't scale" garbage people tell you; it's possible in a threaded server too. Node just plays nice with the technologies you mentioned and was built for this sort of thing.)
    Commented Jun 2, 2013 at 17:24
  • @BenjaminGruenbaum But it is already spiking to ~70%-100% with 20 clients. How far will I be able to scale, and how do I pinpoint what is causing the high CPU usage?
    – Vivek Goel
    Commented Jun 2, 2013 at 17:27
  • Honestly? I have no idea why you're getting that sort of performance. Update node to 0.10.9 and make sure the OS is set up correctly (with abundant file descriptors and such). I have a slightly faster computer (i7 2600K oc'd); the load I'm getting for 20 clients on an HTTP echo server is way under 1% CPU, and 15K concurrency works without any additional setup or tweaking. Nowhere near what you're seeing.
    Commented Jun 2, 2013 at 17:30
  • How is your response time? How does it scale as a function of load? I understand that CPU utilization is high, but what else do you see?
    – Brandon
    Commented Jun 2, 2013 at 18:13
  • I can't find the cite atm, but I think there's no way any server-side platform would outperform Java right now. There are NIO frameworks for Java if you're concerned about concurrency.
    – jiggy
    Commented Jun 2, 2013 at 21:10

1 Answer

Is Node the correct choice for my problem description?

Node.js seems like a good fit for what you're doing. Node was built for exactly this sort of scenario. That's not to say other technologies wouldn't work as well.

Node is a young technology, and you often find yourself sacrificing comfort for performance. Often it's a lot more work, but once you learn how to work with it, it starts to be rewarding.

That said, other technologies might be able to accommodate your needs.

Pros for node

  • Fast
  • Optimized for this sort of IO driven task
  • Has an enthusiastic community which is very interested in helping beginners.
  • Fun to work with (That's totally subjective and is my personal opinion).

Cons for node

  • Often has less stable and mature drivers. If you're writing a production project, this is a biggie in my opinion.
  • New, sometimes has rough edges, sometimes APIs change.
  • Often requires tweaking and reading source code to get working in a satisfactory way.

Your task:

Connect to gearman

Node handles this well: node-gearman works nicely and is pretty stable.

Parse gearman payload to JSON (up to 100 bytes)

JS engines have been and will be extremely fast at parsing JSON. This is because JSON is a subset of JavaScript object literal notation (hence the name!). V8, which is the engine node runs on, does JSON processing reliably fast.
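If you want to check this on your own machine, you can time JSON.parse on payloads of roughly the sizes in your question. The following is only a minimal sketch: makePayload and timeParse are hypothetical helper names, the payload shape is made up, and the timings will vary with hardware and node version.

```javascript
// Hypothetical helper: builds an ASCII JSON string of at least `targetBytes` bytes.
function makePayload(targetBytes) {
    var items = [];
    var json = JSON.stringify({ items: items });
    while (json.length < targetBytes) {
        items.push({ id: items.length, name: 'item-' + items.length });
        json = JSON.stringify({ items: items });
    }
    return json;
}

// Hypothetical helper: parses `json` `iterations` times and returns the
// mean time per parse in microseconds, plus the last parsed value.
function timeParse(json, iterations) {
    var parsed;
    var start = process.hrtime();
    for (var i = 0; i < iterations; i++) {
        parsed = JSON.parse(json);
    }
    var elapsed = process.hrtime(start); // [seconds, nanoseconds]
    var totalMicros = elapsed[0] * 1e6 + elapsed[1] / 1e3;
    return { microsPerParse: totalMicros / iterations, lastResult: parsed };
}

// Sizes taken from the question: ~100 bytes, ~256 bytes, ~10KB.
[100, 256, 10 * 1024].forEach(function (size) {
    var payload = makePayload(size);
    var result = timeParse(payload, 10000);
    console.log(payload.length + ' bytes: ' +
        result.microsPerParse.toFixed(2) + ' us per parse');
});
```

Multiplying the per-parse time by your target request rate gives a rough idea of how much of one core JSON parsing alone would consume.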

Connect to Redis

node-redis lets you do that; it also works nicely.

Parse redis payload to JSON (256 bytes to 10KB; in 80% of cases it will be ~256 bytes)

Again, JSON is not an issue for V8.

Put some data in MySQL

node-mysql is getting better. It still lacks support for prepared statements, but it does transactions and emulates prepared statements with internal escaping.
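To illustrate what emulating prepared statements with internal escaping roughly means: the values are escaped on the client and spliced into the SQL text before the query is sent, rather than being sent to the server separately. The sketch below shows that idea in a simplified form. It is not node-mysql's actual implementation; formatQuery and escapeValue are hypothetical names, the events table is made up, and it should not be used as-is in production.

```javascript
// Characters that need a backslash escape inside a single-quoted SQL string.
var ESCAPES = {
    '\0': '\\0', '\b': '\\b', '\t': '\\t', '\n': '\\n', '\r': '\\r',
    '\x1a': '\\Z', '"': '\\"', "'": "\\'", '\\': '\\\\'
};

// Hypothetical helper: turns a JavaScript value into a SQL literal.
function escapeValue(value) {
    if (value === null || value === undefined) {
        return 'NULL';
    }
    if (typeof value === 'number') {
        if (!isFinite(value)) {
            throw new Error('Cannot escape a non-finite number');
        }
        return String(value);
    }
    if (typeof value === 'boolean') {
        return value ? 'true' : 'false';
    }
    // Everything else is treated as a string and quoted.
    return "'" + String(value).replace(/[\0\b\t\n\r\x1a"'\\]/g, function (ch) {
        return ESCAPES[ch];
    }) + "'";
}

// Hypothetical helper: replaces each `?` with the next escaped value.
// Simplification: it does not skip `?` characters that appear inside
// string literals already present in `sql`.
function formatQuery(sql, values) {
    var index = 0;
    return sql.replace(/\?/g, function (placeholder) {
        if (index >= values.length) {
            return placeholder; // no value left: keep the placeholder
        }
        return escapeValue(values[index++]);
    });
}

console.log(formatQuery(
    'INSERT INTO events (name, size) VALUES (?, ?)',
    ["O'Reilly", 256]
));
```

Because the escaping happens in the client, the server still parses a full SQL string on every call, which is the main difference from real server-side prepared statements.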

Put data in Redis server

Again, node-redis

  • I don't see how Javascript has a distinct advantage in parsing JSON. First of all, JSON is not technically a subset in that line and paragraph separators (\u2028, \u2029) are illegal in Javascript string literals but not in JSON string literals. Secondly, even if it was, I still don't see how the syntactic similarity could possibly affect performance. If it's because javascript has direct equivalents of JSON values, then, so do many other languages but it's still unrelated to syntactic similarity.
    – Esailija
    Commented Jun 2, 2013 at 20:16
  • I never said JavaScript has a distinct advantage in parsing JSON, or an advantage at all (It's not even in the "pros" section, or debated). All I claimed is that JS parses JSON extremely fast, not saying other languages do so slowly, not saying that's an advantage for JS, only that modern JS engines got to the point where serializing or de-serializing JSON is almost never an issue.
    Commented Jun 2, 2013 at 20:52
  • What do you mean by extremely fast? It is a general-purpose textual format; it's going to be much, much slower than the custom binary formats usually used in database<->application communication.
    – Esailija
    Commented Jun 2, 2013 at 21:16
  • What I mean by extremely fast is that in practice it is not a limiting factor.
    Commented Jun 3, 2013 at 23:29
