C++ JSON library
Last edited on Jul 9, 2012

After spending some time trying to find a good JSON library for C++, I realized that all the libraries out there are too heavy to use. Some of them look very good, but using them takes a lot of code. So I decided to write my own. My library is compliant with RFC 4627 except that it doesn't support Unicode or numbers in exponential format.

Seriously, this library is really easy to use and has no dependencies (other than the STL). I have not found another C++ JSON library that is as simple to use.

Usage examples

The library exposes a single object, the "JSON" object, that does everything you need. There is no need to include a whole bunch of header files or juggle a whole bunch of classes. The object exposes these methods:

JSON& operator[](int i);
    Access a list item
JSON& operator[](std::string str);
    Access an object member
std::string str();
    Get the value of an item
void parse(std::string json);
    Parse a JSON document
std::string stringify(bool formatted=false);
    Serialize the JSON object
JSON& addObject(const std::string& name="");
    Add an object
JSON& addList(const std::string& name="");
    Add a list
JSON& addValue(const std::string& val, const std::string& name="");
    Add a string value
JSON& addValue(int val, const std::string& name="");
    Add an integer value
JSON& addValue(double val, const std::string& name="");
    Add a double-precision floating-point value

Reading a JSON document

The library is very simple to use. Just compile it and it will output a "test" executable and a .a that you can link against. Then let's say you have the following JSON document:


The following code is an example of how to use the library:

JSON json;
std::string val;
std::string str = someFunctionThatReadsAJSONDocumentFromFileOrNetworkOrWhatever();
json.parse(str); // parse the document before accessing its members
val = json["obj1"]["member1"][0].str(); // would give "val5"
val = json["list1"][1]["listObject1"].str(); // would give "val2"

Each access to a member returns a JSON object, so there is only one class to deal with at all times. You can keep a reference to each intermediate object or reach any member by chaining the calls.

val= json["obj1"]["member1"][0].str(); // would give "val5"
JSON& j1 = json["obj1"];
JSON& j2 = j1["member1"];
val = j2[0].str(); // would give "val5"
val = j2.str();    // would give "{list}"
val = j1.str();    // would give "{object}"

Invalid paths

The nice thing about this is that you don't need to worry about null objects. If you try to access an invalid member, you will get an invalid JSON object. But if you try to access another member from an invalid JSON object, you will also get an invalid JSON object. You will never get a NULL object that could crash your application.

val= json["obj1"][1].str(); // would give "{invalid}" because obj1 is an object, not a list
val= json["obj2"].str(); // would give "{invalid}"
val= json["obj2"]["member2"].str(); // would also give "{invalid}"
val= json["list"][100].str(); // would give "{invalid}"
val= json["list"][100]["something"].str(); // would also give "{invalid}"
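The "never get a NULL" behaviour can be sketched with a shared invalid sentinel. This is only a minimal illustration under my own assumptions, not the library's actual implementation: the class name Node, the add() helper and the internal layout are hypothetical, and only the "{invalid}", "{list}" and "{object}" strings are taken from the examples above.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the library's JSON class (not its real code).
class Node {
public:
    enum class Kind { Invalid, Value, List, Object };

    Node() = default;  // a default-constructed node is invalid
    explicit Node(std::string v) : kind_(Kind::Value), text_(std::move(v)) {}
    static Node makeList()   { Node n; n.kind_ = Kind::List;   return n; }
    static Node makeObject() { Node n; n.kind_ = Kind::Object; return n; }

    // Append a child. The key only matters for objects. Adding to a value or
    // to an invalid node is ignored, so the shared sentinel is never modified.
    Node& add(Node child, const std::string& key = "") {
        if (kind_ != Kind::List && kind_ != Kind::Object) return *this;
        keys_.push_back(key);
        children_.push_back(std::move(child));
        return children_.back();
    }

    // List access: anything that is not a valid list index yields the sentinel.
    Node& operator[](int i) {
        if (kind_ == Kind::List && i >= 0 &&
            static_cast<std::size_t>(i) < children_.size())
            return children_[static_cast<std::size_t>(i)];
        return invalid();
    }

    // Object access: an unknown key or a non-object yields the sentinel.
    Node& operator[](const std::string& key) {
        if (kind_ == Kind::Object)
            for (std::size_t i = 0; i < keys_.size(); ++i)
                if (keys_[i] == key) return children_[i];
        return invalid();
    }

    std::string str() const {
        switch (kind_) {
            case Kind::Value:  return text_;
            case Kind::List:   return "{list}";
            case Kind::Object: return "{object}";
            default:           return "{invalid}";
        }
    }

private:
    // Single shared invalid node: indexing into it returns itself again,
    // so chained lookups on a bad path never dereference a null pointer.
    static Node& invalid() { static Node n; return n; }

    Kind kind_ = Kind::Invalid;
    std::string text_;
    std::vector<std::string> keys_;   // parallel to children_
    std::vector<Node> children_;
};
```

With a tree built through add(), a lookup such as root["obj2"]["member2"].str() walks through the sentinel twice and still returns "{invalid}".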

Writing a JSON document

There are 3 methods provided to add items in the JSON document:

  • addObject(const std::string& keyName="")
  • addList(const std::string& keyName="")
  • addValue(const std::string& val, const std::string& keyName="")

All 3 functions have an optional keyName parameter. That is because if you add an item to an object, you need to specify the key name that will be used in the parent. Again, I wanted a simple interface that doesn't force the programmer to use different classes for a list and for an object. Here is how those function calls behave depending on whether you provide the key name:

addXXX called on   Key name provided   Result
Object             yes                 item added and parent uses keyName
Object             no                  item added and a key name is auto-generated
List               yes                 item added and key name ignored
List               no                  item added
Value              yes                 operation is ignored
Value              no                  operation is ignored

After adding items to the JSON object, you can serialize it with stringify().

JSON json;
json.addValue("val1"); // key autogenerated because key name not provided
std::string val = json.stringify(true);

Would output:



The project can be found on GitHub

ACD Server with resiprocate
Last edited on Jul 9, 2012

This is another project I've been working on using resiprocate. I use this ACD server with Asterisk. All devices, including the ACD server, talk to Asterisk. The ACD server registers with Asterisk just like any other phone would. When my gateway (FXO) sends a call to Asterisk, Asterisk routes it to the ACD server. When an agent logs on to the ACD, the ACD server reads the Contact header and establishes a presence subscription to that contact. In my case, this is always going to be an extension on Asterisk.


Feature list:

  • multiple queues
  • Agents can be members of many queues
  • Queues are called with an AOR (e.g. telemarkerqueue@acdserver.local)
  • most-idle agent routing (MIA)
  • redirect on no answer (to next MIA, with configurable timeout)
  • music while waiting
  • welcome message + periodic announcements while in queue
  • supports G.711 uLaw only
  • supports SIP INFO only for DTMF
  • REST api with JSON formatted responses
    • list of queues and calls with state, source/destination etc.
    • list of agents with state, idle time etc.
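Most-idle agent routing from the list above can be sketched as a simple scan. This is a minimal illustration and not the server's actual code: the Agent struct and pickMostIdle are hypothetical names, and the fields only mirror the "id", "state", "idletime" and "memberof" entries of the REST response shown further down.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical agent record, modelled on the fields of the REST response.
struct Agent {
    std::string id;
    std::string state;                  // "idle", "loggedout", ...
    int idleSeconds;                    // time spent idle
    std::vector<std::string> memberOf;  // queues the agent belongs to
};

// Returns the idle member of `queue` that has been idle the longest,
// or nullptr when no agent of that queue is currently idle.
const Agent* pickMostIdle(const std::vector<Agent>& agents,
                          const std::string& queue) {
    const Agent* best = nullptr;
    for (const Agent& a : agents) {
        if (a.state != "idle") continue;  // busy or logged out
        const bool member = std::find(a.memberOf.begin(), a.memberOf.end(),
                                      queue) != a.memberOf.end();
        if (!member) continue;            // not serving this queue
        if (best == nullptr || a.idleSeconds > best->idleSeconds) best = &a;
    }
    return best;
}
```

Redirect on no answer would then amount to calling the same function again while skipping the agent that did not pick up.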

Unfortunately, there is more I want to do than I have time for, so here is the list of other features I would like to add when I find the time:

  • Adding agents & queues dynamically without restart
  • Ringall queues
  • Calling an agent directly (agent@acdserver.local)
  • Force a call out of the queue using REST api (unattended transfer to any device on same call server).
  • Use one thread only for all RTP sessions
  • Prevent an agent from logging on to more than one phone, and more than one agent from logging on to the same phone

Using it

Here is a typical scenario:

  • Agent dials *44 (Asterisk routes this to sip:acdlogin@acdserver.local as per the dialplan)
  • ACD server answers and prompts for the agent ID
  • ACD server finds the agent in its internal list
  • ACD server subscribes to the calling phone (using the Contact header)
  • Agent becomes available when a NOTIFY indicates IDLE
  • Agent dials *45 (Asterisk routes this to sip:acdlogout@acdserver.local as per the dialplan)
  • ACD server unsubscribes and sets the agent as unavailable


In sip.conf, under the profile for my acdserver, I set:


When phone1 calls the acdserver, the Contact header will appear to the ACD server as sip:phone1@asterisk.local (because Asterisk is a B2BUA). So when the ACD server tries to subscribe to that device, there needs to be a corresponding entry in the dialplan. This is how I set up my dialplan:

exten => _.,1,Dial(SIP/${EXTEN})

exten => _.,hint,SIP/${EXTEN}

Using "_." is discouraged, but currently it is my only option because I don't have a naming convention for my devices. I'll try to find something better. If all your devices are called phone1, phone2, phone3 etc., then you could use exten => _phone.,1,Dial(SIP/${EXTEN})


The server listens for incoming requests in the form of a RESTful API. The responses are sent as JSON data. I prefer JSON over XML since it is easier to parse with JavaScript and, in my opinion, it also looks nicer. I used my own JSON library, which you can also find on this website. The server currently supports 2 requests.
curl -X GET "" would return

{
    "queues": [
        {
            "name": "queue1",
            "calls": [
                {
                    "id": "335099e931f09ad46ea75b8a451ad65d@",
                    "state": "assigned",
                    "from": "gateway@",
                    "to": "queue1@"
                }
            ]
        },
        {
            "name": "queue2",
            "calls": [ ]
        },
        {
            "name": "queue3",
            "calls": [ ]
        }
    ]
}

curl -X GET "" would return

{
    "agents": [
        {
            "id": "2771",
            "state": "idle",
            "idletime": "14",
            "device": "",
            "memberof": ["queue1","queue2","queue3"]
        },
        {
            "id": "2772",
            "state": "loggedout",
            "idletime": "0",
            "device": "",
            "memberof": ["queue2"]
        },
        {
            "id": "2773",
            "state": "loggedout",
            "idletime": "0",
            "device": "",
            "memberof": ["queue3"]
        }
    ]
}


This code is experimental and is a mess right now. A lot of it can change at any time. The only libraries you will need are resiprocate and ortp.

Implementing your own mutex with cmpxchg
Last edited on Jun 28, 2012

The cmpxchg instruction takes the form "cmpxchg destination, source" where the destination is a memory location and the source is a register. Before using this instruction, you need to load a value in the EAX register. The instruction first compares the value in EAX to the value in memory pointed to by the destination operand. If both values are equal, the value of the source operand is stored in memory at the destination and the zero flag (ZF) is set. If, on the other hand, the destination and EAX do not match, the destination is loaded into EAX and ZF is cleared. With the lock prefix, this compare and store operation is done atomically. At first, it might not be clear why this instruction would be useful. But consider this:

l2: mov eax,[mutex]
    cmp eax,1
    je l2
    mov eax,1
l3: mov [mutex],eax

This is an unsafe way of creating a mutex. You loop until its value is zero and then set a 1 in it. But what if another thread or another CPU changed the value between l2 and l3?

If you need to store the value of a lock in memory (let's say at location 0x12345678) then before attempting to lock a section of code, you would read the lock to see if it is free. So you would read location 0x12345678 and test if this value is zero. If it isn't, you keep on reading memory until it reads as zero (because some other thread cleared it). After that, you would need to store a "1" in this location to take ownership of the lock. But what if another thread takes ownership between the time you read the value and the time you write it? The CMPXCHG instruction writes a "1" there only if a "0" was in memory first. EAX is equal to "0" because we first spin until the memory value is "0". So after that, we tell the CPU: "EAX is zero now, so compare the value at 0x12345678 with EAX (thus 0) and change it to 1 if it is equal. Otherwise, if the value at 0x12345678 is not equal to 0 anymore, load this value into EAX and I will go back to spinning until I get a zero". Simple enough? Here is some sample code that illustrates this.

    mov edx,1
l2: mov eax,[mutex]
    cmp eax,1
    je l2                   ; spin while the mutex reads as taken (1)
    lock cmpxchg [mutex],edx; eax = 0 here. If [mutex] still equals eax, store edx (1)
                            ; in it and set ZF. Otherwise another thread took the
                            ; mutex first: ZF is cleared and eax is loaded with the
                            ; current value of [mutex] (should be 1).
    jnz l2                  ; ZF clear means the exchange failed, so go back to spinning.
                            ; ZF set means we now own the mutex.

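Now, notice how we used "lock" before cmpxchg? The lock prefix makes the whole compare-and-store sequence atomic with respect to the other CPUs, so that none of them can interfere with that memory location in the middle of the operation. Historically the CPU did this by locking the bus; modern processors usually only need to lock the cache line.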
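The same idea can be written in portable C++ with std::atomic. This is a minimal sketch and not code from this post: the SpinLock class is a hypothetical name, and compare_exchange_strong is the standard library call that x86 compilers typically turn into "lock cmpxchg".

```cpp
#include <atomic>
#include <cassert>

// Hypothetical spin lock: 0 means free, 1 means taken, as in the assembly above.
class SpinLock {
public:
    // One compare-and-swap attempt: store 1 only if the value is still 0.
    // Returns true when this caller now owns the lock.
    bool try_lock() {
        int expected = 0;  // plays the role of EAX
        return state_.compare_exchange_strong(expected, 1,
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed);
    }

    void lock() {
        for (;;) {
            // Spin on a plain load until the lock looks free...
            while (state_.load(std::memory_order_relaxed) != 0) {
            }
            // ...then try to take it atomically. If another thread got there
            // first, the exchange fails and we go back to spinning.
            if (try_lock()) return;
        }
    }

    void unlock() {
        assert(state_.load(std::memory_order_relaxed) == 1);  // must be held
        state_.store(0, std::memory_order_release);
    }

private:
    std::atomic<int> state_{0};
};
```

A critical section would call lock() before touching shared data and unlock() afterwards. A real application should normally prefer std::mutex, which puts waiting threads to sleep instead of burning CPU cycles.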

WakeupCall server using resiprocate
Last edited on Jun 14, 2012

This is the first project I did with the resiprocate SIP stack. There are a lot of things left to do in this project, but I wanted to post the code here right away in case someone needs more examples of how to use resiprocate.

Dependencies and limitations

I chose to use resiprocate as the SIP stack, ortp as the RTP stack and libxml2 as the XML parser. The application only supports G.711 uLaw. It only supports SIP INFO for receiving DTMF (in-band and RFC 2833 are not supported).


Resiprocate provides a Dialog Usage Manager (DUM). This engine is very useful for applications that don't want to deal with low-level SIP messages. The DUM lets you receive events such as onOffered, onAnswer, onTerminated (plus many more) through an observer pattern. Using a class called AppDialogSet, it is possible to represent a "call" or a "dialog" and let the DUM manage it. For example, you could override the AppDialogSetFactory with your own CallFactory that would create "Call" objects derived from AppDialogSet. When you receive an event such as onOffered, the DUM will already have created an AppDialogSet with your factory class, and you can then cast this AppDialogSet to your "Call". This is a good way to get a "Call" reference on every event you receive. And the beauty of this is that you never need to delete it because the DUM will take care of it. More information is available on the resiprocate website.


ortp is very easy to use but only provides basic functionality. It won't bind to any sound card or include encoders like other fancy stacks do. This stack only allows you to open a stream and feed it data encoded with whatever codec you want. It is the developer's responsibility to make sure that the data being fed is encoded with the proper codec.

Threading model

I chose to use one thread for general processing and one thread for each RTP session. The main thread is used to give cycles to the resiprocate DUM and to the WakeupCallService. A new thread is created for each RTP session. The RTP session only handles the outgoing stream since we don't need the incoming stream. The ortp stack provides a way to read multiple streams from the same thread, but I prefer to use different threads in order to leverage multi-core CPUs.


The server is a user agent that registers with your PBX. Just call the server and enter the time at which you want your wakeup call and the extension at which you want to be notified. For example, you would enter 0,6,3,0 to get a wakeup call at 6:30 AM. I left the prompts out of the package so you'll want to replace them. The IVR is defined in the XML file. Just change the prompt names. There is no configuration file you can use right now. You will need to set the proper values in config.h. To launch the application, run it and provide, as a command line argument, the IP address to bind to on your computer.
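Turning the four digits entered by the caller into a time could look like the sketch below. It is only an illustration under my own assumptions and not the code from the package: parseWakeupTime is a hypothetical name, and I assume the digits arrive as one "HHMM" string on a 24-hour clock, which is what the 0,6,3,0 example suggests.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: converts four collected DTMF digits ("HHMM", 24-hour
// clock) into minutes since midnight. Returns -1 when the input is not a
// valid time.
int parseWakeupTime(const std::string& digits) {
    if (digits.size() != 4) return -1;
    for (char c : digits) {
        if (!std::isdigit(static_cast<unsigned char>(c))) return -1;
    }
    const int hours = (digits[0] - '0') * 10 + (digits[1] - '0');
    const int minutes = (digits[2] - '0') * 10 + (digits[3] - '0');
    if (hours > 23 || minutes > 59) return -1;
    return hours * 60 + minutes;
}
```

For the example above, the digits 0,6,3,0 give 6 * 60 + 30 = 390 minutes after midnight.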


Download the source code

FFT on AMD64
Last edited on Jun 5, 2012

Fast Fourier Transform with x86-64 assembly language

This is an old application I wrote a while ago, in 2005, when I got my first 64-bit CPU (AMD). The first thing I did after installing my new CPU was to open vi and start coding an FFT using 64-bit registers. This is old news, but 64-bit at that time was awesome. Not only can you store 64 bits in a register, but you get 16 general purpose registers, twice as many as in 32-bit mode!

The only really annoying thing with this architecture is that it doesn't provide a bit-reversal instruction. I don't understand why a simple RISC processor like the AVR32 (look up "brev") has one but not a high-end CISC processor from Intel or AMD. I don't actually show the bit-reversal part of the FFT here though.
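For reference, the bit-reversal step that is left out here can be done in software. The following is a minimal sketch in C++ rather than the routine I actually used: reverseBits and bitReversePermute are hypothetical names, and the input length is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Reverse the lowest `bits` bits of `x` (e.g. 001 -> 100 for bits == 3).
std::size_t reverseBits(std::size_t x, unsigned bits) {
    std::size_t r = 0;
    for (unsigned i = 0; i < bits; ++i) {
        r = (r << 1) | (x & 1);  // move the lowest bit of x into r
        x >>= 1;
    }
    return r;
}

// Reorder `data` so that element i ends up at the bit-reversed index of i.
// The size must be a power of two.
void bitReversePermute(std::vector<float>& data) {
    const std::size_t n = data.size();
    assert(n != 0 && (n & (n - 1)) == 0);

    unsigned bits = 0;  // log2(n)
    while ((std::size_t{1} << bits) < n) ++bits;

    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t j = reverseBits(i, bits);
        if (i < j) std::swap(data[i], data[j]);  // swap each pair only once
    }
}
```

For 8 samples, the indices 0..7 end up in the order 0, 4, 2, 6, 1, 5, 3, 7.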

By the way, I remember doing some tests with this algorithm and, although I don't remember the results exactly (7 years ago), I remember that it was running at least 5 times faster than most other FFTs in other libraries.

//; x8664realfft(float* source,float** spectrum,long size)
        mov     	$1,%eax
        cvtsi2ss     %eax,%xmm10
        pshufd  	$0b00000000,%xmm10,%xmm10
        mov     	$-1,%eax
        cvtsi2ss     %eax,%xmm10
        pshufd  	$0b11000100,%xmm10,%xmm10
        jmp     	fftentry
	mov		$1,%eax
	cvtsi2ss	%eax,%xmm10
	pshufd	$0b00000000,%xmm10,%xmm10
fftentry:
        pushq   	%rbp
	movq    	%rsp,%rbp
	pushq	%rbp
	subq		$0xFF,%rsp
	movq	%rsp,%rbp
	//; make a 16bytes aligned buffer
	addq		$16,%rbp
	andq		$0xFFFFFFFFFFFFFFF0,%rbp

	pushq	%r15
	pushq	%r14
	pushq	%r13
	pushq	%r12
	pushq	%r11
	pushq	%r10
	pushq	%r9
	pushq	%r8

        //; rcx = size
        movq    	%rdx,%rcx  				
        pushq	%rcx
	//; rdx = source 
	mov		%rdi,%rdx				
	pushq		%rdx

	//; rdi = spectrum[0]
	movq	(%rsi), %rdi			
	addq		$8, %rsi
	//; rsi = spectrum[1]
	movq	(%rsi), %rsi			

	//; r8 = log2(N), r14= N
	pushq	%rcx
	fld1
	fild		(%rsp)
	fyl2x
	xorq		%r8,%r8
	pushq	%r8
	fistp		(%rsp)
	popq		%r8
	popq		%r14	
	//; bit reversal has already been done prior to calling this function
	//; r9 = nLargeSpectrum
	//; r10 = nPointsLargeSpectrum
	movq	%r14,%r9
	movq	$1,%r10
	movq	$1,%r11
	mov	%rdi,%r14
	mov	%rsi,%r15
	//;load 2PI in st(0)
	fldpi
	fldpi
	faddp	%st(0),%st(1)
	movq	%r8,%rcx

l1:	pushq	%rcx
	shrq	$1,%r9
	shlq	$1,%r10
	//;st(0) = theta, st(1) = 2pi
	fld	%st(0)
	pushq	%r10
	fidiv	(%rsp)
	popq	%r10

	//;xmm0 = 2*costheta[0],2*costheta[0],2*costheta[0],2*costheta[0]
	//;  st(0) = theta, st(1) = 2pi
	pushq	%rax
	fld	%st(0)
	fcos
	fstp	(%rsp)
	movss	(%rsp),%xmm0
	pshufd	$0b00000000,%xmm0,%xmm0
	popq	%rax
	addps	%xmm0,%xmm0
	movq	%r9,%rcx
l2:	pushq	%rcx
	//; r12 = point1 (index *4bytes)    r13 = point2 (index *4bytes)
	movq	%r10,%r12
	movq	%r9,%rax
	subq	%rcx,%rax
	pushq	%rdx
	mulq	%r12
	popq	%rdx
	movq	%rax,%r12
	movq	%r11,%r13
	addq	%r12,%r13
	shlq	$2,%r13
	shlq	$2,%r12

	//; xmm2 = costheta[2],sintheta[2],costheta[1],sintheta[1]  
	movq	%r12,16(%rbp)
	decq		16(%rbp)
	fld		%st(0)
	fimul		16(%rbp)
	fsincos
	fstp		(%rbp)
	fstp		4(%rbp)
	decq		16(%rbp)
	fld		%st(0)
	fimul		16(%rbp)
	fsincos
	fstp		8(%rbp)
	fstp		12(%rbp)
	movaps	(%rbp),%xmm2
	pshufd	$0b10110001 ,%xmm2,%xmm2
	//;xmm1 = costheta[1],sintheta[1],0,0
	movhlps	%xmm2,%xmm1
	movq	%r11,%rcx
	//; recurrence formula
	//; xmm3 = w.re,w.im,w.re,w.im
l3:	movaps	%xmm2,%xmm3
	mulps	%xmm0,%xmm3
	subps	%xmm1,%xmm3
	movlhps	%xmm3,%xmm3
	movaps	%xmm2,%xmm1
	movaps	%xmm3,%xmm2
	mulps	%xmm10,%xmm3
	//; xmm5 := c.im,c.re,c.re,c.im
	movq	%r14,%rdi
	movq	%r15,%rsi
	addq		%r13,%rdi
	addq		%r13,%rsi
	movss	(%rdi),%xmm5
	pshufd	$0b00000011,%xmm5,%xmm5
	addss	(%rsi),%xmm5
	pshufd	$0b00101000,%xmm5,%xmm5
	//; xmm3 := inner product: re,re,im,im
	mulps	%xmm3,%xmm5
	pshufd	$0b11011101 ,%xmm5,%xmm3
	pshufd	$0b10001000 ,%xmm5,%xmm5
	addsubps	%xmm5,%xmm3
	pshufd	$0b10101111,%xmm3,%xmm3
	//;xmm6 := sortedArray[point1].re,sortedArray[point1].re,sortedArray[point1].im,sortedArray[point1].im
	movq	%r14,%rdi
	movq	%r15,%rsi
	addq	%r12,%rdi
	addq	%r12,%rsi
	movss	(%rdi),%xmm6
	pshufd	$0b00001111,%xmm6,%xmm6
	addss	(%rsi),%xmm6
	pshufd	$0b11100000,%xmm6,%xmm6
	addsubps	%xmm3,%xmm6
	pshufd	$0b00100111,%xmm6,%xmm6
	movss	%xmm6,(%rdi)
	pshufd	$0b11100001,%xmm6,%xmm6
	movss	%xmm6,(%rsi)
	movq	%r14,%rdi
	movq	%r15,%rsi
	addq	%r13,%rdi
	addq	%r13,%rsi
	pshufd	$0b01001110,%xmm6,%xmm6
	movss	%xmm6,(%rdi)
	pshufd	$0b11100001,%xmm6,%xmm6
	movss	%xmm6,(%rsi)
	//; increase point1 and point2 by 4 bytes (each index represent a float)
	addq		$4,%r12
	addq		$4,%r13
	decq		%rcx
	jnz		l3
	popq		%rcx
	decq		%rcx
	jnz		l2

	//; remove theta from fpu stack
	fstp		%st(0)
	shlq		$1,%r11
	popq		%rcx
	decq		%rcx
	jnz		l1

	popq	%rdx
	//; rcx is already pushed in stack
	cvtsi2ss      (%rsp),%xmm1
	pshufd  	$0b00000000,%xmm1,%xmm1
	popq		%rcx
	shrq          $2,%rcx
	movq	%r14,%rdi
	movq	%r15,%rsi

	//; is this a ifft or a fft?
	cvtss2si	%xmm10,%eax
	cmp	$-1,%eax
	jne	nrm

cp:	movaps	(%rdi),%xmm2
	movntdq	%xmm2,(%rdx)
	addq	$16,%rdi
	addq	$16,%rdx
	loop	cp
	jmp	cleanexit

nrm:	movaps	        (%rdi),%xmm2
	movaps	        (%rsi),%xmm3
	divps		%xmm1,%xmm2
	divps		%xmm1,%xmm3
	movntdq	        %xmm2,(%rdi)
	movntdq	        %xmm3,(%rsi)
	addq		$16,%rdi
	addq		$16,%rsi
	loop		nrm

cleanexit:
	fstp		%st(0)
	popq		%r8
	popq		%r9
	popq		%r10
	popq		%r11
	popq		%r12
	popq		%r13
	popq		%r14
	popq		%r15
	addq		$0xFF,%rsp	
	popq		%rbp
	leave
	ret
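The "recurrence formula" step in the listing appears to advance the twiddle factor from the two previous ones using 2*cos(theta), which avoids calling sin and cos for every point. Here is a minimal sketch of that recurrence in C++. It is my reading of the technique and not a translation of the assembly: the function name twiddles is hypothetical, and the sign convention of the angle is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

// Generates w[k] = cos(k*theta) + i*sin(k*theta) for k = 0..count-1 with
//     w[k] = 2*cos(theta) * w[k-1] - w[k-2]
// so that only the two seed values need real trigonometric calls.
std::vector<std::complex<double>> twiddles(double theta, int count) {
    std::vector<std::complex<double>> w;
    if (count <= 0) return w;

    const double twoCos = 2.0 * std::cos(theta);
    std::complex<double> prev2(std::cos(-theta), std::sin(-theta));  // w[-1]
    std::complex<double> prev1(1.0, 0.0);                            // w[0]
    w.push_back(prev1);

    for (int k = 1; k < count; ++k) {
        const std::complex<double> next = twoCos * prev1 - prev2;
        w.push_back(next);
        prev2 = prev1;
        prev1 = next;
    }
    return w;
}
```

Rounding errors accumulate along the recurrence, so for long transforms it is common to reseed it from sin and cos every so often.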