Fuck you! Posted on 22 May 2012 at 00:59
C:\>ls
'ls' is not recognized as an internal or external command, operable program or batch file.

Fuck you.

Full fbembed integration into .NET applications Posted on 21 May 2012 at 04:12

Or as I would call it: going full retard.

In my last article, I provided static builds of the fbembed.dll library, even including the ADO.NET driver. As I am a fan of no-setup, single-EXE deployment, this was a good step forward, because it eliminated the dependencies on the ICU and MSVCRT libraries. However, there is a problem: as ILMerge doesn't work for WPF applications anymore, I use the Costura library to merge assemblies into one executable. It basically works by hooking the event that is raised when an assembly cannot be resolved and then loading the assembly from a resource stream, and thus from memory. This is favorable because, without temporary file extraction, no problems arise when multiple instances, even with different versions of the assembly, are running at the same time.

Although the second of the provided DLLs is a managed assembly, it cannot be loaded from memory, because the native code it contains has to be mapped into the process address space and therefore has to physically reside on the file system (there is an exception, but I doubt it'll play well with DllImport). This was a major problem: without extracting the library, I could not load the native code in it.

But there is one place for executable code which is guaranteed to be available: the executable itself. There is no major difference between an EXE and a DLL. Both are PEs (portable executables), both can export symbols, and both can dynamically and statically link to code. So the only way to embed the Firebird engine into a .NET application was to embed the native code directly into the application itself.

First try: C++/CLI wrapper

My first idea was to write a FirebirdSql.Data.Client.Native.IFbClient implementation as a C++/CLI class which, instead of doing lots of interop calls, directly calls into the statically linked fbembed.lib. I produced a little test case, which worked, so I started implementing the wrapper class. However, it turns out that C++/CLI projects cannot be statically linked against the MSVCRT libraries (the /MT switch is incompatible with /CLR), which was the whole point. It's also a major PITA to write all the interop code yourself, and every API change would mean changing the wrapper code. So this proved a dead end.

Second try: linking everything together

As the ADO.NET driver allows arbitrary names for the client library, my idea was to link my application against fbembed.lib with a .def file declaring all the exported symbols. This has a two-fold advantage: the .def file will warn you about missing symbols, and it forces all the relevant code into the result.

It's ugly, and I will describe its ugliness below, but the first test failed. As the linking didn't yield any .pdb files, debugging was impossible, so it really was a guessing game. In the end, I came to two conclusions:

  • If you create an executable which is linked against MSVCRT, the runtime library will do some initialization, like initializing global and static variables, preparing the heap, etc., which is a transparent process.
  • If you create a DLL, the runtime library will do basically the same initialization the moment you call LoadLibrary, just as transparently.

In either case, the modules will fail if this initialization has not been done. Because we are treating a DLL like an EXE, the function that would normally do the initialization, DllMain, is never invoked automatically. Normally it is called once per load/unload and each time a thread attaches. DllMain itself doesn't initialize the runtime library, though; the symbol that does is _DllMainCRTStartup, and that is what we have to call ourselves (the "jump start" below). So the whole process is the following:

  1. Implement a "jump start" for the MSVCRT code you will later link in
  2. Compile your .NET/WinForms/WPF project as usual
  3. Disassemble the assembly, remove the assembly manifest part from the IL code, and recompile as a .netmodule
  4. Link all necessary libraries, including Windows libraries, fbembed, all its dependencies, the static MSVCRT (libcmt.lib), your native .RES file and .NET resources together, with a .def file not only containing the fbclient/fbembed symbols, but also _DllMainCRTStartup

You need to use the Visual Studio 2008 tools, because the 2010 tools will only yield .NET 4.0 executables. In my case, I want 3.5, which uses the 2.0 runtime. The jump start code looks like this:

public enum ReasonForCall : uint
{
    DLL_PROCESS_ATTACH = 1,
    DLL_PROCESS_DETACH = 0,
    DLL_THREAD_ATTACH = 2,
    DLL_THREAD_DETACH = 3
}

[DllImport("kernel32.dll", CharSet = CharSet.Auto)]
private static extern IntPtr GetModuleHandle(IntPtr module);

[DllImport("YourAssembly.exe")]
private static extern uint _DllMainCRTStartup(IntPtr hModule, ReasonForCall ulReasonForCall, IntPtr lpReserved);

[...]

_DllMainCRTStartup(GetModuleHandle(IntPtr.Zero), ReasonForCall.DLL_PROCESS_ATTACH, IntPtr.Zero);

You have to adjust your connection string to include the not-so-common client library:

string connectionString = "ServerType=1;User=SYSDBA;Password=masterkey;Dialect=3;Charset=UTF8;Database=DATA.FDB;client library=YourAssembly.exe";

The ADO.NET driver will load/create databases and work as usual. Later on, I will create a small tool that performs the necessary steps, because currently the IL code has to be modified manually, and it's important to know which libraries are required and how to link them. For now, the result is a fully functional WPF application that creates/opens a Firebird database and executes queries against it, without any external dependencies besides the .NET Framework itself.

Static fbembed builds Posted on 19 May 2012 at 23:52

Currently, fbembed.dll, the embedded database engine for Firebird, has the following external dependencies:

  • MSVCR80.DLL
  • ICUUC30.DLL

The ICUXX libraries are the International Components for Unicode. By modifying the build process, a static library can be produced instead of a dynamic one, which can then be linked against. MSVCRXX.dll is the dependency created by the compiler itself. I offer the following static builds for Firebird with no external dependencies besides the Windows internal libraries:

The second library not only contains the fbembed library, but also the current ADO.NET driver, so it can be referenced directly in Visual Studio .NET projects. A connection can be opened with the following code:

string connectionString = "ServerType=1;User=SYSDBA;Password=masterkey;Dialect=3;Charset=UTF8;Database=C:\\TEST.FDB";
FbConnection.CreateDatabase(connectionString, 4096, true, true);
using (FbConnection connection = new FbConnection(connectionString))
{
    connection.Open();
    [...]
}

Please note that due to the nature of the .NET assembly containing unmanaged code, certain features are not available, notably x64 execution and loading the assembly from a resource. I'm currently working on a way to embed the Firebird functionality into a single file without the need to temporarily extract the DLL. At least you don't need to bother with the Unicode libraries and MSVCRT anymore.

First impressions of the Parrot AR.Drone 2.0 Posted on 02 May 2012 at 23:48

I got my brand new AR.Drone 2.0 today. First bummer: the Android app is not yet available. As I have an HTC Desire Z and no iPhone or iPad, I cannot use the full potential of the new drone. However, the old app is still usable: you don't get to see the video of the new 720p camera, and the absolute positioning feature doesn't work yet. The missing video isn't a real problem, because as a beginner you really have to fly with the drone in range. The absolute positioning feature would certainly be useful, but again, beginners should start slowly.

Battery

At first, the battery didn't accept a charge. The charger kept blinking red without the battery being charged. I'm not sure what the problem is, but it's probably the cells being unbalanced. One thing is for sure: the battery has much less capacity than what could have been fitted into its size (1000 mAh, where 2200 mAh would have added only about 50 grams). The charger is also awfully slow; the manual states 1.5 hours of charging for presumably 12 minutes of flight time, which I estimate at about 9 minutes of hovering and much less when moving around. This basically isn't even enough to train. As soon as you get a grip on the controls, the battery drops to critical levels and the drone shuts down.

Flying

Flying is fun, but even though the drone is a quadrocopter, it takes some practice. For a beginner like me, as soon as the orientation changes from facing away from me, my control goes haywire. Fortunately the indoor hull provides protection, and I strongly advise against flying without it indoors. When the drone hits a wall or some other object, the hull flexes, blocking a rotor and thus triggering an emergency shutdown. As the rotors are made of flexible plastic, no real damage happens. But keep your glue ready (I use Pattex Repair Extreme) to repair damage to the hull itself, which is made of foam plastic. I already broke a part, which was easy to fix with the glue.

I disabled the acceleration-sensor-based control, because it introduced another variable into the whole flying experience. It certainly is cool to control the drone by tilting your phone (for anyone not familiar with the AR.Drone: it connects via Wi-Fi to your Android or iPhone smartphone, which makes it really versatile), but in my experience, the tilt sensors in recent phones are a bit erratic and unpredictable. Instead you get two control areas: the left one controls tilt and thus flight direction, the right one controls height and rotation.

The main issue is the short operating time and the long wait for the battery to charge again. I ordered some aftermarket batteries with more than double the capacity and a professional charger, which can charge a battery in about the same time it takes to discharge it. Even with all the built-in stabilization, controlling the drone is not easy and requires some training. Hopefully the additional batteries will allow that, so that I can someday use it outside and at heights above 2 meters. I also ordered some bearings, because I noticed that the rotors have much more clearance than they should have. I don't know if that makes much difference, but sturdy rotors seem like a reasonable precondition for controlled flight.

General thoughts

Controlling the drone over Wi-Fi is certainly ingenious, because it doesn't require a dedicated remote control and also provides a return channel for the video feed. On the other hand, the lack of a new Android app is a real letdown, because by now they could have at least adapted the old one to accommodate the new video format. Wi-Fi also limits the useful range, but then again, the high-bandwidth connection required to transmit the video will always have serious range limitations, or require dedicated directional antennas. I'm not yet sure if the AR.Drone is a real model aircraft or just a silly toy for people with too much money. I have yet to test the camera; I don't expect cinema quality, but 720p30 at least sounds promising. Recording is done to a USB stick which you can put directly into your drone, or, at lower quality, directly to your phone.

Using VirtualBox, I set up a Linux server with the intention of hosting multiple virtual machine instances in order to make as much use of the server as possible. Networking proved to be a big problem. The ISP allocates up to eight public IP addresses for you, so each machine can have its own address. There are different ways to make the virtual machines available on the public internet.

Bridged network

This basically works like a virtual Ethernet switch that connects your virtual instances and one designated physical network card together. Each host has a different MAC address and behaves like a distinct device on the network. Herein lies the problem: my ISP registers the MAC address of the physical network adapter with its network switches, and as soon as the switch sees an Ethernet frame coming from my port with a different MAC address, it will not only discard that frame, but also shut down the port completely and permanently. This is done to prevent ARP spoofing, and is a reasonable precaution. Anyway, whatever you do, DO NOT connect VirtualBox instances in bridged mode to the public network adapter. This only works in normal Ethernet setups, not in data centers with managed switches and monitoring against ARP spoofing. I got my networking back after a call to support, but because the port was completely disabled, I could only access the server through a serial terminal for which the ISP provides SSH access. Without this way to disable the bridged-mode configuration, the ARP spoofing detection would have kicked in again the minute the port was re-enabled and disabled it once more.

Bridged network №2

The setup is basically the same, the instances are all in bridged mode, connected to the public network adapter, but this time, each machine gets the same MAC address as the physical adapter of the hypervisor. This kind of works, because the ISP's switch never sees a foreign MAC address, and the internal VirtualBox architecture simply distributes all incoming frames to all connected virtual instances. However, it still creates some problems:

  • All machines' virtual network adapters (virtio, by the way) are basically permanently in promiscuous mode. Although only IP packets destined for a particular machine are really picked up, firewalls permanently log dropped packets because of the wrong destination address. It also opens up a security hole and reduces performance by having all machines inspect the traffic of all the other machines, including the hypervisor itself.
  • Connections between the machines were really slow. I never figured out exactly why, but because the whole setup was so ridiculous (there has to be a reason why each network card usually gets its own unique MAC address), I didn't bother with it. There was also a lot of spurious ARP traffic, probably because the machines couldn't agree on which MAC address belonged to which IP.

NAT

This is the default mode when creating a new VirtualBox instance. I never bothered using it, because in my tests it was awfully slow, so slow actually that it was unusable (this may have changed). It also creates a lot of problems. The basic setup would be to have the hypervisor claim all the public IPs, which creates the first problem: which IP should the hypervisor itself use, and which ones are reserved for the virtual machines? The virtual instances then get private IP addresses on a host-only adapter, and each additional public IP is forwarded to a private one. While this probably works, and the speed issue could be resolved by using iptables instead of the VirtualBox built-in NAT mode, an "identity problem" remains because none of the virtual instances know their real public address. This will seriously sabotage protocols like FTP that need to know their own public IP address.
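For illustration, a minimal sketch of such iptables-based forwarding, assuming a second public IP of 198.51.100.12 and a virtual instance at 192.168.56.12 on a host-only network (both addresses are placeholders); this forwards the traffic, but does not solve the identity problem:

# forward everything arriving for the second public IP to the virtual instance
iptables -t nat -A PREROUTING -d 198.51.100.12 -j DNAT --to-destination 192.168.56.12
# rewrite outgoing traffic from the instance so it leaves under its public IP
iptables -t nat -A POSTROUTING -s 192.168.56.12 -j SNAT --to-source 198.51.100.12
echo 1 > /proc/sys/net/ipv4/ip_forward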

Proxy ARP

After a lot of research, I found a way that actually works and behaves well. Each virtual instance is configured with its public IP address and the correct DNS and gateway settings, as if it were directly connected to the external network. Each virtual network adapter also has its own unique MAC address. The only difference is that they are all connected to a host-only, internal network adapter which can be created with VirtualBox.
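As a sketch, this wiring can be done with VBoxManage (the VM name "guest1" is a placeholder; the adapter is usually called vboxnet0):

# create the host-only adapter on the hypervisor
VBoxManage hostonlyif create
# attach the guest's first network adapter to it
VBoxManage modifyvm "guest1" --nic1 hostonly --hostonlyadapter1 vboxnet0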

Now the machines can talk to each other, but not to the internet, and they cannot be reached from the outside, because nobody knows they are actually there. To mitigate this, the almost ancient proxy_arp mechanism is enabled. It basically makes the host behave like an Ethernet bridge by answering ARP requests on behalf of the machines behind it, impersonating their Ethernet interfaces. Enabling it is simple:

echo 1 > /proc/sys/net/ipv4/conf/eth0/proxy_arp
echo 1 > /proc/sys/net/ipv4/conf/vboxnet0/proxy_arp

Be advised that the host-only networking adapter only comes up after at least one virtual instance has been started and connected to the network. Alternatively, you can force it up with a simple command, which makes it easier to put everything together into a boot script:

ifconfig vboxnet0 [HYPERVISOR_IP] netmask [YOUR_NETMASK] up

Replace [HYPERVISOR_IP] with the primary public IP address of your hypervisor, and [YOUR_NETMASK] with its netmask, which will often be 255.255.255.255.

After issuing this command, the hypervisor will not only answer ARP requests for its own address, but also forward requests for the other addresses and relay the answers back to the adapter on which the request arrived. This way the external router knows that it can reach the additional public addresses through the physical network adapter of the hypervisor.

The only things left to do are enabling routing and adding routes so that IP packets actually get forwarded in both directions. You might also need to tweak your iptables configuration to allow the traffic through (a minimal example follows the routing commands below), as the hypervisor now acts as a transparent Ethernet bridge and a stateful firewall at the same time.

route add [FIRST_VIRTUAL_IP] vboxnet0
route add [SECOND_VIRTUAL_IP] vboxnet0
route add [THIRD_VIRTUAL_IP] vboxnet0
echo 1 > /proc/sys/net/ipv4/ip_forward
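If your firewall policy drops forwarded traffic by default, a minimal sketch of the additional iptables rules (interface names match the ones used above; adjust to your own rule set):

# let forwarded traffic pass between the physical and the host-only adapter
iptables -A FORWARD -i eth0 -o vboxnet0 -j ACCEPT
iptables -A FORWARD -i vboxnet0 -o eth0 -j ACCEPT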

As usual, replace the bracketed placeholders in the route commands with your actual IPs and add as many routes as you have IP addresses allocated for virtual machines. As an optional step, you can configure each virtual instance with direct routes so as to avoid the round trip to the ISP's gateway for internal traffic; a sketch of this follows below. But this only improves performance and makes no other difference.
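A rough sketch of such a direct route, executed inside one guest for the address of another guest on the same host-only network (the IP is a placeholder):

# reach the neighboring instance directly instead of via the ISP's gateway
route add -host 198.51.100.12 dev eth0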

The only remaining problem was the ISP's preferred IP setup, with netmask 255.255.255.255 and the default gateway at 10.255.255.1. Most Linux servers won't accept this configuration out of the box (I'll write an article on how to make it work), and some firewall products like Microsoft Forefront Threat Management Gateway (TMG) won't even run with it. The hypervisor setup could be changed to give the virtual hosts proper IP addresses and netmasks (i.e. a netmask where the host address and the default gateway share the same subnet).
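Until that article exists, here is a rough sketch of the usual iproute2 approach for such a setup, with 198.51.100.11 as a placeholder for the guest's public address:

# assign the public address with a host netmask
ip addr add 198.51.100.11/32 dev eth0
# tell the kernel the gateway is reachable on eth0 even though it is outside any local subnet
ip route add 10.255.255.1 dev eth0
# and only then set it as the default gateway
ip route add default via 10.255.255.1 dev eth0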

Virtualization disk storage concerns Posted on 01 May 2012 at 18:34

I am using a dedicated Linux server as a virtualization hypervisor. VirtualBox is a free, multi-platform, type 2 user-mode hypervisor that works well with many different guest operating systems. One problem, though, was less-than-acceptable disk IO performance in the guests. The server has a RAID5 of three off-the-shelf 7,200 rpm hard disks and should deliver around 100 MB/s of sequential reading. What I saw was erratic behavior, from high transfer rates (especially when reading non-allocated blocks) down to very low ones.

The default and pretty much standard method for creating a virtual hard disk is using a dynamic image. Dynamic images have a size attribute, and this is how much space they report to the guest OS. However, they start as small files and only get larger when the guest actually allocates, i.e. writes, a block. This creates a number of problems:

  • The disk image has to maintain a table or tree of allocated blocks, because allocations can occur anywhere: at the beginning, in the middle or at the end of the emulated block device.
  • As an operating system is installed, the image will rapidly grow to more than 10 GB, but in small increments. In a real world file system, growing a large file in small chunks will inevitably lead to fragmentation of the file.
  • Starting with small files makes it tempting to provision more total virtual hard disk space than there is physical space available.

The first problem is partially solved by allocating large chunks and using an efficient algorithm to map between the emulated block device and the disk file. However, as this is implementation specific, we don't have a lot of control over how it works. It's basically a simple file system inside a file.

The second problem is much worse. With several guest systems running in parallel, fragmentation will occur, especially on file systems with little free space. Because the files can get pretty big, defragmentation is problematic, as it produces a lot of IO and requires a contiguous stretch of free space the size of the disk file.

[root@hypervisor ROOT1]# xfs_bmap ROOT1.vdi
ROOT1.vdi:
        0: [0..59391]: 4280160..4339551
        1: [59392..175103]: 4841920..4957631
        2: [175104..401023]: 5656128..5882047
        3: [401024..856831]: 39002944..39458751
        4: [856832..1772415]: 50868288..51783871
        [...]
        226: [319714816..319991295]: 914358336..914634815
        227: [319991296..322039295]: 916455488..918503487
        228: [322039296..322094591]: 918552640..918607935
        229: [322094592..323094015]: 920649792..921649215
        230: [323094016..323804671]: 922746944..923457599
        231: [323804672..324185511]: 924844096..925224935

To sum up what happens when a guest tries to access a file: the guest has to consult its own file system trees to find the physical location of the file on the emulated disk. We'll simply assume that this incurs no overhead. Now that the guest knows the location, it issues read commands to the emulated hard disk controller. The hypervisor then has to check its internal entries in the virtual disk image file to locate the position where the requested data blocks are stored. Then the hypervisor's file system has to locate the real physical position on the hard drive. As we know, a mechanical hard disk usually doesn't deliver more than 100 IOPS due to its access times of about 10 ms. Under unfavorable circumstances, even a sequential read by the guest, which would be pretty fast on a physical machine, will lead to dozens of different places being accessed.

Further problems arise if you try to mitigate the situation. You might be inclined to defragment the guest file system, but not only is this a very slow process, it also won't result, due to the nature of the dynamic disk image, in large files occupying contiguous space on the physical disk. You can also try to defragment the host file system, but this too produces a high IO load and, again due to the nature of the dynamic disk image, still yields no contiguous blocks where the guest would expect them.

The third problem doesn't seem like a problem at all. It is a kind of disk over-commitment, as usually deployed with RAM or with sparse files. However, there is a very distinct difference between memory over-commitment and dynamic images: unused space can never be reclaimed in dynamic images. Or rather, almost never: images can be compacted. But most file systems only delete nodes in their file system structures and usually don't touch the data itself, and the hypervisor, on the other hand, doesn't know anything about the file system of the guest, so it cannot reclaim that unused space either. In order for the host to be able to reclaim the space, the guest has to blank out the data blocks it doesn't need anymore.

This is usually done by first defragmenting the guest, and then writing a big file full of zeroes until the disk is full. This is NOT recommended for production systems, as a full or nearly full hard disk is the worst nightmare for any kind of server, be it a database, web or file server. At least shut down services that might try to allocate space on the disk. The whole process also has to be supervised and takes quite some time, especially if the dynamic disk has a large maximum size, and it will cause the image file to grow to its maximum size on the host, causing further fragmentation there. A more intricate way would be to mount the virtual disks on the hypervisor or in a separate recovery OS and have a file-system-aware tool zero out the unused space, but this requires the guest to be shut down. Doing it online would also be possible, but even the Microsoft tool sdelete simply chooses to fill up the disk with a useless file. Guest extensions could theoretically signal the unused space to the host, in the same fashion that TRIM signals unused space to SSDs. But never fill up your hard disks, especially not C:\ and especially not on a server, least of all a production system.

After the unused space has been cleared, the disk image can be compacted offline, meaning further downtime. The process is also very lengthy, but usually ends in a very compact image, bringing us back to where we started: an ever-growing, fragmentation-prone file system in a file system in a file system. I will not document this process, as it is documented elsewhere and in my opinion a big waste of time, just to save a few GBs that will eventually get filled up again.
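For reference, a minimal sketch of the sequence just described for a Linux guest, with the compaction done afterwards on the host (the image path is a placeholder, and again: not something to do on a production system):

# inside the guest: fill the free space with zeroes, then remove the file again
dd if=/dev/zero of=/zerofill bs=1M
rm /zerofill
sync
# on the host, with the guest shut down: compact the dynamic image
VBoxManage modifyhd /vm/guest1.vdi --compact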

First solution: fixed-sized images

To overcome these problems, I tried to convert the dynamic images to fixed-size ones. You can use VBoxManage to convert between dynamic and fixed-size images, but be advised that the process cannot easily be aborted, as VBoxSVC does all the work, and Ctrl-C'ing the VBoxManage instance doesn't do anything at all.

VBoxManage clonehd [ old-fixed-VDI ] [ new-dynamic-VDI ] --variant Standard
VBoxManage clonehd [ old-dynamic-VDI ] [ new-fixed-VDI ] --variant Fixed

This has to happen offline, so again, downtime. It's lengthy, it requires at least as much free space as the fixed-size image will occupy, and it again leads to fragmentation. If you do get the resulting fixed-size image down to an acceptable degree of fragmentation, no further fragmentation will occur, the overhead of the dynamic image is gone, and there is a good chance that contiguous blocks of data in the guest will be represented by contiguous blocks in the host file system, so unnecessary disk seeking is avoided. In my experiments, however, xfs_bmap reported hundreds of fragments for the fixed-size disk file, and IO performance still wasn't anywhere near where I wanted it to be. To sum it up, it was just a large waste of time, and by large I mean several hours. You basically end up with the worst of both worlds: an immensely large image file, still fragmented, no possibility of over-commitment, and still no real increase in performance, as each disk access in the guest still has to go through the host file system.

Second solution: enter LVM

As I had a lot of unused space, and the host was set up with LVM, I tried the raw disk approach. You basically copy the raw contents to a logical volume and create a proxy image file which redirects to this raw volume. First you have to convert the disk image (dynamic or fixed-size, it doesn't matter) to a raw file:

VBoxManage internalcommands converttoraw file.vdi file.raw

Then you set up a new logical volume in LVM with at least the size of the raw file. After dd'ing the contents from the raw file over to the volume, the proxy image file is created:

dd if=file.raw of=/dev/mapper/vgXY-ABC
VBoxManage internalcommands createrawvmdk -filename ~/rawdisk.vdi -rawdisk /dev/mapper/vgXY-ABC

There is one caveat: VirtualBox and its instances usually run as non-root, so you need to chown/chgrp the block device representing your logical volume before this works. You also have to run all the above commands either as root or with sudo.

The result

As predicted, disk operations by the guest now behave much more like they would on a physical disk. You still have the option to grow the disk with LVM, with minimal fragmentation, and the erratic behavior is gone. Average read speeds of 100 MB/s are normal. Instances boot faster and are more responsive under concurrent load.

As the disk is now hosted on an LVM volume, we can do some neat tricks. One technique to back up a virtual machine online is to use VirtualBox snapshots aka states. This technique is not recommended, because while the system itself might be in a consistent state, including the RAM content, it will not account for client connections. A restored instance could have incomplete or still-locked files and other issues. Dynamic images with multiple and possibly nested states/snapshots also introduce additional overhead. I have also had situations where a saved state could not be restored due to changes in the host. Cold-starting the instance is then like having pulled the plug while the machine was running, which can lead to data and/or file system corruption.

Strategies for redundancy and reliability are clustering and online backups, mainly because large-scale backups cannot be restored in a reasonable time. However, for a full backup, and if a short downtime can be accommodated, one way is to simply shut down (or hibernate, if you want to save the memory state) the guest, create an LVM volume snapshot, and then copy the snapshot to your backup media, after which the snapshot can be removed. Creating an LVM snapshot while the guest is running is possible, but can lead to data and/or file system corruption if outstanding caches have not been flushed to disk.
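A minimal sketch of that full-backup sequence, with placeholder volume group, volume and path names:

# with the guest shut down: freeze the volume in a snapshot (the guest can be started again right away)
lvcreate --snapshot --size 5G --name guest1_backup /dev/vgXY/guest1
# copy the frozen snapshot to the backup media, then drop it
dd if=/dev/vgXY/guest1_backup bs=1M | gzip > /backup/guest1.img.gz
lvremove -f /dev/vgXY/guest1_backup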

Conclusion

IO performance could be significantly increased by using a raw LVM volume instead of a dynamic image. As the images had already grown to nearly their maximum size through usage, not much space is wasted. However, the process was lengthy, so I advise creating production instances directly on raw volumes and using dynamic images only for testing, or when moving, duplicating, etc. is likely, as smaller files take less time to handle. Another side effect is that file system corruption on the host doesn't affect the guest volumes, as they no longer exist as (fragmented) files. The raw volumes can usually be restored with the LVM tool suite and then recreated on the same or a different machine, as long as the partition table and the LVM configuration are intact.

One problem remains: after restarting the host computer, LVM will give the default root/root permissions to the block devices, leaving VirtualBox unable to access them. I have yet to decide how to resolve this, either by making the ownership permanent, by having a script change ownership at boot time, or by giving the vbox user account more rights. But because an unattended shutdown/reboot is currently not possible anyway (it would require the hypervisor to send ACPI requests to the guests, wait for them to shut down, which can take minutes, especially if all instances shut down simultaneously, then let the host shut down, and finally restart the instances when the machine comes back up), this isn't an issue right now. I usually shut down all instances manually, and then the hypervisor. A UPS for the hypervisor is advised, because pulling the plug on XFS-based file systems with the write cache enabled will usually lead to data loss.
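As a sketch of the boot-script variant, a line like the following in /etc/rc.local (user, group and volume names are placeholders) would restore the ownership after each reboot:

# give the VirtualBox user access to the raw guest volumes again
chown vbox:vboxusers /dev/mapper/vgXY-guest1 /dev/mapper/vgXY-guest2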

Because drive space is cheap, I would at least stick with fixed-size images or raw volumes for production systems. Both can be enlarged if really necessary, and current trends towards SANs, iSCSI and distributed file systems with transparent sparse file support make dynamic disk images redundant, either because they substitute the feature or because virtual machines can access external disk space more easily without affecting the disk image itself. The only niche where dynamic disk images seem to fit is capacity-limited SSDs, where seek time is no longer an issue. But then again, a grown disk image needs considerable space and time to reclaim unused space, so guest operating systems either need to be aware of the virtualization, or user-mode tools have to be developed to help with online shrinking.