WWW.DUMAIS.IO

C++ WEBSOCKET SERVER

2015-04-01

I recently wanted to learn a bit more about websockets. By that, I don't mean how to use websockets from JavaScript, but rather how the server side works and what the protocol looks like. So I decided to write my own server library. I followed RFC 6455, but there are still some things I need to change in order to be fully compliant. There's not much to say about the library other than that it is very easy to use. I did try libwebsockets before. It is pretty complete, but I felt it was a little more complicated than it should be. So although my library is not as complete as libwebsockets, it is easier to use and will be good enough for most of my projects.

My code is hosted on github: https://github.com/pdumais/websocket

SIP ATTACK BANNING

2015-03-30

New and improved version

I wrote this article a few years ago and posted a C++ application I wrote that automatically invokes iptables to block hosts that are abusing my Asterisk server.

I rewrote the application, this time in Perl. I use Net::Pcap to sniff the network. The script runs as a daemon and watches traffic going out of the LAN. It filters SIP responses and automatically invokes iptables to block a remote host when it sees Asterisk send that host more than 10 (configurable) responses with a status code of 400 or higher. Only responses to REGISTER and INVITE requests are counted.
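The counting rule above can be sketched in a few lines. The script itself is written in Perl; this is only a C illustration of the threshold logic, and the names record_response, host_counter and MAX_HOSTS are hypothetical. It assumes the caller has already parsed the destination IPv4 address and the SIP status code out of the sniffed packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_HOSTS 64   /* hypothetical table size */

struct host_counter {
    uint32_t ip;       /* remote host the responses were sent to */
    unsigned errors;   /* number of responses with status >= 400 */
    int used;
};

static struct host_counter hosts[MAX_HOSTS];

/* Record one SIP response sent to `ip`. Returns 1 once the host has
 * received more than `threshold` error responses, 0 otherwise. */
static int record_response(uint32_t ip, int status, unsigned threshold)
{
    assert(status >= 100 && status <= 699);   /* valid SIP status range */
    if (status < 400)
        return 0;                             /* only error responses count */

    struct host_counter *slot = NULL;
    for (size_t i = 0; i < MAX_HOSTS; i++) {
        if (hosts[i].used && hosts[i].ip == ip) {
            slot = &hosts[i];                 /* known host */
            break;
        }
        if (!hosts[i].used && slot == NULL)
            slot = &hosts[i];                 /* remember first free slot */
    }
    if (slot == NULL)
        return 0;                             /* table full: ignored in this sketch */

    if (!slot->used) {
        slot->used = 1;
        slot->ip = ip;
        slot->errors = 0;
    }
    slot->errors++;
    return slot->errors > threshold;
}
```

When the function returns 1, the caller would invoke iptables for that host. How the ban is applied and when it expires is left out of this sketch.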

Script

You will find the script on github: https://github.com/pdumais/astban

BLOCK CACHING AND WRITEBACK

2014-12-19

I recently wrote a disk driver for my x86-64 OS. I also wrote a block caching mechanism with delayed writeback to disk.

Block Caching

Reading and writing blocks happens at a layer below the filesystem, so there is no notion of available or used blocks. This layer only reads, writes, and caches blocks.

Reading a block

When a read request is executed, the requested block is first searched for in the cache. If the block is already cached, its data is returned. If the block does not exist in the cache, a new cache entry is created and marked as "pending read". The new cache entry is associated with the device and block number that were requested. The request then blocks the current thread until the entry gets its "pending read" status cleared, which is done by the IRQ handler. Creating a new cache entry is done atomically, so only one entry can be created at a time. This prevents two reads of the same block that occur at the same time from issuing two read requests to the disk.

When a block is read from disk, it is kept in the cache. Every time it is accessed, a timestamp is updated to keep track of the latest access.
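Here is a minimal single-threaded sketch of that read path, not the code from block_cache.c. The names cache_get, cache_read_done, CACHE_SIZE, BLOCK_SIZE and ticks are hypothetical, the data buffer is embedded in the entry to keep the example short, and a real kernel would wrap cache_get in a lock (or another atomic mechanism) and put the calling thread to sleep while the entry is "pending read".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CACHE_READ_PENDING 2
#define CACHE_BLOCK_VALID  4
#define CACHE_SIZE 8      /* hypothetical number of entries */
#define BLOCK_SIZE 512    /* assumed block size */

struct entry {
    unsigned long block;
    unsigned char device;
    volatile unsigned char flags;
    unsigned long lastAccess;
    char data[BLOCK_SIZE];
};

static struct entry cache[CACHE_SIZE];
static unsigned long ticks;   /* stand-in for a real clock */

/* Return the entry for (device, block). On a miss, claim a free entry and
 * mark it "pending read". Returns NULL when the cache is full. */
static struct entry *cache_get(unsigned char device, unsigned long block)
{
    struct entry *free_slot = NULL;
    for (size_t i = 0; i < CACHE_SIZE; i++) {
        struct entry *e = &cache[i];
        if (!(e->flags & CACHE_BLOCK_VALID)) {
            if (free_slot == NULL)
                free_slot = e;            /* remember first free entry */
            continue;
        }
        if (e->device == device && e->block == block) {
            e->lastAccess = ++ticks;      /* hit: refresh the timestamp */
            return e;
        }
    }
    if (free_slot == NULL)
        return NULL;
    free_slot->device = device;
    free_slot->block = block;
    free_slot->flags = CACHE_BLOCK_VALID | CACHE_READ_PENDING;
    free_slot->lastAccess = ++ticks;
    return free_slot;
}

/* What the disk IRQ handler would do once the data has arrived. */
static void cache_read_done(struct entry *e, const char *src)
{
    assert(e->flags & CACHE_READ_PENDING);
    memcpy(e->data, src, BLOCK_SIZE);
    e->flags &= (unsigned char)~CACHE_READ_PENDING;
}
```

A second request for the same block finds the existing "pending read" entry instead of creating a new one, which is what prevents the duplicate disk read.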

Scheduling

A function called schedule_io() is called at the following times:

  • At the end of a disk IRQ.
  • After a cache entry is marked "pending read" while the disk driver is not busy (so no pending operation would trigger an IRQ).

The schedule_io() function iterates through the list of cache entries, finds an entry that is "pending read" and then requests the disk driver to read the sector from disk. Several different algorithms can be used in this function to choose which "pending read" entry to service. A common one is the "elevator" algorithm, where the scheduler chooses to execute a read operation for the sector that is closest to the last one read. This limits seeking on disk. An elevator that needs to go to floors 5, 2, 8, 4 will not jump around all those floors. If the elevator is currently at floor 3, it will do: 2, 4, 5, 8.
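The "closest to the last one read" rule can be sketched as a greedy selection. This is an illustration only, with hypothetical names pick_closest and service_order. Note that it is strictly a nearest-request-first policy: a textbook elevator (SCAN) would also keep moving in one direction before reversing.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the pending sector closest to `last`, or -1 if
 * there is none. On a tie, the first one in the list wins. */
static int pick_closest(const unsigned long *pending, size_t count,
                        unsigned long last)
{
    int best = -1;
    unsigned long best_dist = 0;
    for (size_t i = 0; i < count; i++) {
        unsigned long d = pending[i] > last ? pending[i] - last
                                            : last - pending[i];
        if (best < 0 || d < best_dist) {
            best = (int)i;
            best_dist = d;
        }
    }
    return best;
}

/* Fill `out` with the order in which the requests would be serviced,
 * starting from sector `start`. The `pending` array is consumed. */
static void service_order(unsigned long *pending, size_t count,
                          unsigned long start, unsigned long *out)
{
    unsigned long pos = start;
    for (size_t n = 0; n < count; n++) {
        int i = pick_closest(pending, count - n, pos);
        assert(i >= 0);
        pos = pending[i];
        out[n] = pos;
        pending[i] = pending[count - n - 1];   /* remove the serviced request */
    }
}
```

With the floors from the example (5, 2, 8, 4, starting at 3), service_order produces 2, 4, 5, 8.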

That is not the algorithm I chose to implement though. To keep it simple (and thus very inefficient), my scheduler just picks the first "pending read" entry it sees in the block cache list. When there are no more read requests, the scheduler proceeds with write requests. So read requests always have higher priority. This is good for speed, but bad for the reliability of data persistence.
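The "first pending read, then writes" policy can be sketched as follows. This is an assumption-laden illustration, not the code from block_cache.c: pick_next is a hypothetical name, and the flags are passed as a plain array instead of the real cache entries.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_WRITE_PENDING 1
#define CACHE_READ_PENDING  2

/* Return the index of the next entry to service, or -1 when nothing is
 * pending. *is_write is set to 0 for a read and 1 for a write.
 * Reads always win over writes. */
static int pick_next(const unsigned char *flags, size_t count, int *is_write)
{
    assert(is_write != NULL);
    for (size_t i = 0; i < count; i++) {
        if (flags[i] & CACHE_READ_PENDING) {
            *is_write = 0;
            return (int)i;
        }
    }
    for (size_t i = 0; i < count; i++) {
        if (flags[i] & CACHE_WRITE_PENDING) {
            *is_write = 1;
            return (int)i;
        }
    }
    return -1;
}
```

Because the write loop only runs when no read is pending, a steady stream of reads can delay writeback indefinitely, which is the reliability trade-off mentioned above.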

Updating a block

When data needs to be written to an existing block, the block may already have been loaded in memory. This means that it was either read earlier for some other reason, or it was read so that a small portion of it could be updated. Either way, it is already in the cache and needs to be written back to disk. In that case, the "pending write" flag is set on it, and when the scheduler picks it up, it sends a write request to the disk driver.

The following scenarios could occur:

  • Trying to read while write pending
    It doesn't matter. The block will be read directly (from memory). This could happen after writing into a block and reading it right away. You would want the updated version.
  • Trying to write a block that does not exist yet in the cache
    This means that the block was never read and we just want to overwrite whatever is in it. A cache entry will be created for the block and the data will be copied into it. The "write pending" flag will be set.
  • Trying to update while write pending
    This call needs to block until the block has finished being written back to disk, because we want to avoid updating it in the middle of a write.

Block cache list

To keep things simple (and again very inefficient), I chose to implement the block cache list as a fixed-size array. A better approach would be to store the entries in a tree and let it grow as long as there is available memory.

Each cache entry is as follows:

#define CACHE_WRITE_PENDING 1
#define CACHE_READ_PENDING 2
#define CACHE_BLOCK_VALID 4
#define CACHE_IN_USE 8
#define CACHE_FILL_PENDING 16

struct block_cache_entry
{
    unsigned long block;            /* sector number on disk */
    char *data;                     /* cached block data */
    unsigned char device;           /* device the sector belongs to */
    volatile unsigned char flags;   /* combination of the CACHE_* bits */
    unsigned long lastAccess;       /* timestamp of the latest access */
} __attribute__((packed));

Each entry has a field to determine the sector number on disk and the device number to which the sector belongs. lastAccess is used by the cache purging algorithm. The flags field is a combination of the following bits:

  • CACHE_WRITE_PENDING: The block contains valid data that has not been flushed to disk yet, but should be.
  • CACHE_READ_PENDING: The block does not contain data yet and is waiting for a disk read operation to fill it.
  • CACHE_BLOCK_VALID: The entry is valid. If 0, the entry is invalid and is free to use for caching. If 1, it contains valid data that belongs to a sector on disk.
  • CACHE_IN_USE: The entry is in use by the cache subsystem and should not be purged.
  • CACHE_FILL_PENDING: The entry was created for a write operation but does not contain data yet. So it cannot be read nor flushed to disk, but it should not be purged either.

Clearing cached block

When there is no space left in the block cache list (in my case because the fixed-size array is full, or, for the tree version, because the tree cannot grow anymore), cached blocks must be cleared. The block cache finds the blocks with the oldest access time that are not pending write or read, and frees them. Obviously, this is a very simple algorithm that does not take into account the frequency of access, or the probability of access given the location on disk. But it works.
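A minimal sketch of that purge step, freeing one entry per call, could look like this. The function name evict_oldest and the CACHE_BUSY_MASK macro are hypothetical. Skipping entries marked CACHE_IN_USE or CACHE_FILL_PENDING, in addition to the pending read and write ones, is an assumption based on the flag descriptions, and the packed attribute is left off since it does not matter for the example.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_WRITE_PENDING 1
#define CACHE_READ_PENDING  2
#define CACHE_BLOCK_VALID   4
#define CACHE_IN_USE        8
#define CACHE_FILL_PENDING  16

/* Entries with any of these bits set must not be purged (assumption). */
#define CACHE_BUSY_MASK (CACHE_WRITE_PENDING | CACHE_READ_PENDING | \
                         CACHE_IN_USE | CACHE_FILL_PENDING)

struct block_cache_entry {
    unsigned long block;
    char *data;
    unsigned char device;
    volatile unsigned char flags;
    unsigned long lastAccess;
};

/* Free the valid, idle entry with the oldest access time.
 * Returns its index, or -1 if no entry can be purged right now. */
static int evict_oldest(struct block_cache_entry *cache, size_t count)
{
    assert(cache != NULL);
    int victim = -1;
    for (size_t i = 0; i < count; i++) {
        unsigned char f = cache[i].flags;
        if (!(f & CACHE_BLOCK_VALID) || (f & CACHE_BUSY_MASK))
            continue;                      /* free already, or busy */
        if (victim < 0 || cache[i].lastAccess < cache[victim].lastAccess)
            victim = (int)i;
    }
    if (victim >= 0)
        cache[victim].flags = 0;           /* entry is now free for reuse */
    return victim;
}
```

The scan is linear in the size of the array, which matches the "simple but inefficient" spirit of the fixed-size list.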

ATA driver

Just for reference, here is a sample of the disk driver. The full source code can be downloaded at the end of this article, but I will show a portion of it here anyway.



Download

block_cache.c
block_cache.h
ata.c (the disk driver)

BEAGLEBONE BLACK BARE METAL DEVELOPMENT

2014-12-08

Not so long ago, I wrote a small OS prototype for the Cortex-A8 CPU. I was using qemu but now I wanted to play with a real device. So I decided to give it a shot with my BeagleBone Black.

Booting

The BeagleBone Black's AM3359 chip has an internal ROM (located at 0x40000000) that contains boot code. That boot code will attempt to boot from many sources, but I am only interested in booting from the eMMC. The boot code expects the eMMC to be partitioned and the first partition to be FAT32. I don't know if there is any way to just use the eMMC as raw memory and have the AM3359 boot code load whatever is at the bottom of the flash without any partitions, so I will live with the FAT32 concept. I want to use u-boot because I want to be able to update my kernel with tftp. The stock BBB has the eMMC formatted with a FAT32 partition with u-boot on it. I will make a u-boot script that downloads my kernel from the tftp server, copies it to flash memory and then has u-boot load that kernel from flash memory into RAM. That last step is not necessary, but I want to do it because at a later point in time, I will remove the tftp part from the u-boot script and only have the kernel in flash be loaded in RAM.

The proper way to do this would be to store the kernel file and all of my application files in the ext2 partition that is already present in the eMMC. But then I would need an ext2 driver in my kernel so that it could load the application files from the flash. I don't want to bother writing an ext2 driver for now, so I will hack my way through this instead. Instead of getting u-boot to download the kernel and applications into an eMMC partition, I will get it to write the kernel at a fixed location (0x04000000) in the eMMC. This will most probably overwrite a part of the first or second partition, but I really don't care at this point, as long as I don't overwrite the partition table and the beginning of the FAT32 partition where u-boot sits. Then all applications will be written one after the other just after the kernel, in a linked-list style.

According to section 2.1 of the TI reference manual for the AM335x, the ROM starts at 0x40000000. But then, in section 26.1.3.1, they say that the ROM starts at 0x20000. This is very confusing. It turns out that when booting, memory location 0x40000000 is aliased to 0x00000000. The CPU starts executing there, and some ROM code jumps to the "public ROM code". The public ROM code starts at ROM_BASE+0x20000. Since memory is aliased, 0x20000 is the same as 0x40020000. Section 26.1.4.2 says that the ROM code relocates the interrupt vector to 0x20000, probably using CP15 register c12. When the ROM code finds the x-loader (MLO) in flash memory, it loads it in SRAM at 0x402F0400. At this point, system behavior is defined by u-boot (MLO was built with u-boot).

What confused me at first was that I thought the eMMC was mapped to 0x00000000. It turns out that this memory is not directly addressable. So if I need to retrieve my applications from eMMC, I need to write an eMMC driver, because the eMMC is only accessible through MMC1. Now that I understand how eMMC works, I realize it was foolish of me to think it could be directly addressable. The MMC1 peripheral lets you communicate with the on-board eMMC, but you still need to write your own code to talk to it using the SD/MMC protocol.

I had a really hard time finding information on how to read the eMMC. The TI documentation is good at explaining how to use the MMC controller, but it doesn't explain how to actually communicate with the eMMC. That's normal, since the eMMC is board dependent. The TI documentation explains how to initialize the device, but since we know that the board contains an eMMC, we don't have to go through all the trouble of detecting card types and so on. I was really surprised at how hard it was to find good documentation on the MMC/SD protocol.

I can't really explain what I did. All I know is that it works, and that the code will definitely not be portable to another board. I read the TRM, looked at other source code and, through trial and error, was able to read the eMMC. The file emmc.S in my source code is pretty easy to understand. I was not able to send the proper commands to set the device in "block addressing mode" and to change the bus width. Like I said, this information is kind of hard to find. I'll have to do a lot more research to make this work.

I want u-boot to download my kernel from tftp and load it in memory. There doesn't seem to be any easy way to do this. I couldn't find a way to install u-boot on my BBB without installing a full eMMC image containing Linux. So I decided to just use the stock eMMC image, but modify u-boot to boot my kernel instead of the installed Linux. It seems, however, that changing the environment variable "bootcmd" is impossible from u-boot on the BBB. There is the uenv.txt file residing on the FAT partition that I can change to contain my own script to download my kernel, but that too is impossible to modify directly from u-boot.

So I ended up creating an SD card with an Angstrom image, booting from the SD card, mounting the eMMC FAT32 partition and editing the uenv.txt file. I modified it to look like this:

uenvcmd=set ipaddr 192.168.1.37;set serverip 192.168.1.3;tftp 0x80000000 os.bin;tftp 0x90000000 apps.bin;mmc dev 1;mmc write 0x90000000 0x28000 0x100;go 0x80000000

Now every time I want to update the uenv.txt file, I need to boot from the SD card, because I am destroying the second partition on the eMMC with my kernel since I use raw writes on the eMMC. This is not a nice solution, but it works for now.

Software IRQ

The software IRQs on the BBB work in a completely different way than on the realview-pb-a8 board. On the BBB, software IRQs are not dedicated IRQs. You get a register that allows you to trigger an IRQ that is already tied to a hardware IRQ. So you can only use a software IRQ to fake a hardware IRQ. This means that you could send software IRQ 95, but that would be the same as receiving a timer7 IRQ. You actually need to unmask IRQ 95 for this to work, but unmasking IRQ 95 also allows you to get timer7 IRQs. In my case, this is excellent, because my timer7 IRQ calls my scheduler code. So a Yield() function just triggers that IRQ artificially using the software IRQ register.
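A yield built on the software IRQ register could be sketched as below. This is not the code from the repository: the function names are hypothetical, and the register offsets (INTC at 0x48200000, INTC_MIR_CLEARn at 0x88 + 0x20*n, INTC_ISR_SETn at 0x90 + 0x20*n) are my reading of the AM335x TRM and should be checked against it. The base pointer is passed as a parameter so the logic can also be exercised against a plain array on a host machine.

```c
#include <assert.h>
#include <stdint.h>

#define INTC_BASE        0x48200000u  /* assumed AM335x INTC base address */
#define INTC_MIR_CLEAR0  0x88u        /* assumed: write 1 to unmask an IRQ */
#define INTC_ISR_SET0    0x90u        /* assumed: write 1 to raise a software IRQ */
#define INTC_BANK_STRIDE 0x20u        /* one register bank per 32 IRQs */
#define TIMER7_IRQ       95u

/* Write the bit for `irq` into the banked register whose bank-0 offset
 * is `reg0`. `intc` points at the start of the interrupt controller. */
static void intc_write_bit(volatile uint32_t *intc, uint32_t reg0, unsigned irq)
{
    assert(irq < 128);
    uint32_t offset = reg0 + INTC_BANK_STRIDE * (irq / 32);
    intc[offset / 4] = 1u << (irq % 32);
}

static void intc_unmask(volatile uint32_t *intc, unsigned irq)
{
    intc_write_bit(intc, INTC_MIR_CLEAR0, irq);
}

static void intc_soft_irq(volatile uint32_t *intc, unsigned irq)
{
    intc_write_bit(intc, INTC_ISR_SET0, irq);
}

/* Yield by faking a timer7 interrupt, which runs the scheduler. */
static void yield_cpu(volatile uint32_t *intc)
{
    intc_soft_irq(intc, TIMER7_IRQ);
}
```

On the board, the call would be yield_cpu((volatile uint32_t *)INTC_BASE), after IRQ 95 has been unmasked once with intc_unmask. IRQ 95 lands in bank 2, bit 31.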

User-mode handling of IRQs

User-mode threads can register interrupt handlers in order to be notified when GPIO is triggered. The way this works is that whenever an interrupt is received, if a user-handler is defined, then the page table is changed to the page table base address of the thread that is interested in receiving the event. Then, a jump to the handler is done. So the CPU stays in IRQ mode, but the page table is changed and the user-mode handler is executed in IRQ mode.

The code

There is a lot more I could describe here, but the source code might be a better source of documentation. Basically, the other things I have accomplished are:

  • AM3358 interrupt controller
  • AM3358 timer
  • SPI driver for a port expander (MCP23S18) and for an EEPROM chip (25aa256)
  • Pin muxing
  • GPIO (output and input with interrupts)
  • Sending data on more than one UART

https://github.com/pdumais/bbbos