<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Kanojo.de Blog &#187; computer</title>
	<atom:link href="http://blog.kanojo.de/tag/computer/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.kanojo.de</link>
	<description></description>
	<lastBuildDate>Fri, 06 Jan 2012 09:54:50 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Raid system &#8211; 8TB home storage, on budget!</title>
		<link>http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/</link>
		<comments>http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/#comments</comments>
		<pubDate>Tue, 11 Oct 2011 20:36:22 +0000</pubDate>
		<dc:creator>nebuk</dc:creator>
				<category><![CDATA[Computer]]></category>
		<category><![CDATA[Electronic]]></category>
		<category><![CDATA[Free-Software]]></category>
		<category><![CDATA[household]]></category>
		<category><![CDATA[Resources]]></category>
		<category><![CDATA[tinkering]]></category>
		<category><![CDATA[Tutorial]]></category>
		<category><![CDATA[10tb]]></category>
		<category><![CDATA[computer]]></category>
		<category><![CDATA[free software]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[raid]]></category>
		<category><![CDATA[storage]]></category>
		<category><![CDATA[wood]]></category>

		<guid isPermaLink="false">http://kanojo.blogs.ghostdub.de/?p=2573</guid>
		<description><![CDATA[With the advent of higher-definition television, growing demand for high-quality lossless audio, and general madness, my need for a reliable, flexible and large home storage solution grew rapidly. Hammering more disks into your home router / server just won't nail it over the long term. [...]]]></description>
			<content:encoded><![CDATA[<p>With the advent of higher-definition television, growing demand for high-quality lossless audio, and general madness, my need for a reliable, flexible and large home storage solution grew rapidly. Hammering more disks into your home router / server just won't nail it over the long term. So I set out to build a cheap (per TB), (hopefully) long-lasting and reasonably reliable home storage system for the enthusiast (read: "tinkering geek"). This was achieved using a custom-made case for the parts and a lucky find for the adapter card. Read on for more... <span id="more-2573"></span></p>
<h2>PLANNING AND CONSIDERATIONS</h2>
<p>When planning a system like this, you first need to decide a few things. What are the aspects that are most important to you?</p>
<ul>
<li>Speed?</li>
</ul>
<p>I've decided that this system will be mass storage, a data grave. Thus I have made no attempt whatsoever to optimize for speed. The result is predictable: it is not really fast.</p>
<ul>
<li>Security?</li>
</ul>
<p>Does the data need special protection? For me it was simple: I've got an okay-ish CPU in the host box, so encrypting all the data is no big problem. Why NOT do it?</p>
<ul>
<li>Reliability?</li>
</ul>
<p>How important is the data you want to save, really? RAID is nice, but it is NO BACKUP. Are there special considerations to make for disaster recovery? As this is going to be a data grave for me, I've opted for a "best effort" policy. I've got a used UPS to protect my equipment (the server and this storage) from power losses and voltage fluctuations. Then I decided on a RAID system, namely good old RAID 5. As I said, I haven't optimized for speed, so RAID 0+1 was not an option, since you'd need more disks to get the same usable storage size. As for disaster recovery, I've decided to use an LVM layer on top of my crypto layer. This allows me to take snapshots when I perform critical filesystem operations (for snapshots that have to absorb huge amounts of writes, I think I can even use a large external disk as the snapshot device...). If the filesystem ever fails so hard that I have to perform potentially dangerous operations (and FSCKing a "crashed" ext filesystem IS dangerous), I can always just snapshot, toy around, and if I break it, revert the snapshot...</p>
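<p>To put numbers on that trade-off, here is a minimal Python sketch of the usable capacity per RAID level. The 2 TB drive size is an assumption (it matches the 8TB from the title with 5 drives), and the function name is made up for illustration:</p>

```python
def usable_capacity_tb(level, disks, disk_size_tb):
    """Return the usable capacity in TB for a few common RAID levels."""
    data_disks = {
        "raid0": disks,        # striping only, no redundancy
        "raid5": disks - 1,    # one disk's worth of parity
        "raid6": disks - 2,    # two disks' worth of parity
        "raid10": disks // 2,  # every block is stored twice (also covers 0+1)
    }[level]
    return data_disks * disk_size_tb


# Assumed drive size: 2 TB each.
print("RAID 5,   5 disks:", usable_capacity_tb("raid5", 5, 2.0), "TB")
print("RAID 0+1, 8 disks:", usable_capacity_tb("raid10", 8, 2.0), "TB")
```

<p>Under these assumptions a mirrored-stripe layout needs 8 drives for the capacity RAID 5 reaches with 5, which is the whole argument for RAID 5 on a budget.</p>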
<ul>
<li>Noise and vibration?</li>
</ul>
<p>As for vibration (and thus much of the noise of a hard disk array), you need to trade a few things off. On the one hand, it's really nice to mount disks on thick rubber layers to dampen their vibrations. This has the drawback that the hard disks will always move: they'll swing on the flexible rubber layer. Consider this article/video by Brendan Gregg to see why this is bad: <a href="http://blogs.oracle.com/brendan/entry/unusual_disk_latency">Unusual Disk Latency</a>. So basically you'd want something that is very rigid so the drives don't move *too* much. On the other hand, you want to isolate each hard drive from the vibrations of the others. Huge commercial RAID arrays all have (and need) ways of isolating individual drives. Consider a scenario where a RAID 5 or similar array starts writing: many drives start seeking at the same time, sending a huge vibration wave through the case that disturbs the other drives and, as I've heard (yes, only heard), can even cause head crashes.</p>
<p>I've attempted to achieve this by applying my knowledge from building speaker enclosures. The scenario and the vibration frequency range are quite similar. To suppress case movements I've chosen an (internally) even and dense material: MDF. While this has the drawback that it heavily couples the drives, it will do a good job of dampening the overall vibration. As an upgrade to this (which I haven't implemented yet), one could use an additional "energy trap" consisting of a movable, rubber-like but very stiff material. Fortunately this is readily available as a mat to put your washing machine on! If I'm not entirely mistaken, putting the whole construction on a piece of such a mat will not only effectively lower the (audible) noise from the array but also further dampen vibrations by taking in the (slight) case movement and turning it into "heat" (by elastic deformation, I assume; I still need to ask a physicist about that).</p>
<ul>
<li>Connection?</li>
</ul>
<p>How do you plan to connect the device to your computer? There are basically two ways of doing this. One is to use a cheap SATA adapter card with enough ports (see below), which leaves you with a large bundle of cables running from wherever you place the array into your computer (and requires some kind of large hole in your case to route the cables through). I've chosen this option as it fits my needs (server standing somewhere in a drawer) better. The other option I've come across is SATA port multiplier cards. These nifty little things implement an integral but not widespread part of the SATA specification: a kind of hub/switch. If your adapter card supports SATA port multipliers (and really, REALLY check that it does!) you can get a rather cheap (~60 EUR maybe) multiplier card (e.g. from "dawicontrol"), which enables you to build a completely self-contained storage box: just plug one (e)SATA cable into your computer and zuuup, there are your 5 drives!</p>
<h2>CHOOSING PARTS</h2>
<p>There's a bit of a philosophical question here regarding hard drives. You could use 5 identical drives bought from the same store, which are then more likely to come from the same production batch. On the one hand this would be great, because seek and throughput performance would be the same and you wouldn't waste bandwidth waiting for one slower drive. On the other hand this can suck: if there's a manufacturing defect and one drive dies, you replace it and start a rebuild, the rebuild puts heavy strain on the other drives, and since they bear the same defect they are likely to die too, leaving you with severe data loss. So I'd recommend that you either try to get the same drives from different batches or, as I did, buy 5 completely different drives. As speed is a non-issue for me, this seemed the safest choice when it comes to per-batch production defects.</p>
<p><a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1689/" rel="attachment wp-att-2680"><img class="size-medium wp-image-2680 alignright" src="http://blog.kanojo.de/files/2011/10/IMG_1689-550x412.jpg" alt="" width="338" height="253" /></a></p>
<p>As for the adapter card, I was very lucky to find leaf-computer.de, who seem to offer quality storage controller chips on very simple PCBs, such as this: <a href="http://www.leaf-computer.de/raid-controller-8-port-sata-ii-pci-x.html" target="_blank">Marvell 8-Port SATA Adapter</a>. Exactly what I need! And at what a small price tag! The Marvell chip used on this board is known for excellent Linux driver support as well as good performance and a nice feature set. Heck, you could even plug SATA port multipliers into it and get 8x 5-disk RAID 5! It also supports staggered spin-up (spinning up one drive after another, which puts less load on the PSU).</p>
<p>As for the other components, I've simply tried to choose reasonable parts with a good bang-for-the-buck ratio. A simple 450W PSU with 85% efficiency even in the medium power range, a reasonably silent and powerful large fan... that's about it.</p>
<h2><a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1690/" rel="attachment wp-att-2681"><img class="aligncenter size-medium wp-image-2681" src="http://blog.kanojo.de/files/2011/10/IMG_1690-550x412.jpg" alt="" width="550" height="412" /></a></h2>
<h2>IMPLEMENTATION (hardware)</h2>
<p>Taking the measurements of the drives and calculating the case sizes was a bit of a hassle, but in the end it worked out nicely. Here you can see the parts of the case loosely stacked on each other.</p>
<p><a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1691/" rel="attachment wp-att-2682"><img class="aligncenter size-medium wp-image-2682" src="http://blog.kanojo.de/files/2011/10/IMG_1691-550x412.jpg" alt="" width="550" height="412" /></a>   <a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1692/" rel="attachment wp-att-2683"><img class="aligncenter size-medium wp-image-2683" src="http://blog.kanojo.de/files/2011/10/IMG_1692-550x412.jpg" alt="" width="550" height="412" /></a></p>
<p>As for the parts needed, here's the list:</p>
<ul>
<li>1* 306x186mm</li>
<li>2* 102.2x186mm</li>
<li>1* 219x186mm</li>
<li>2* custom-measured parts of ??x105.4mm</li>
</ul>
<p>Everything is made of 16mm MDF.</p>
<p>Here you can then see how the PSU will be mounted (it has two mounting holes where that square timber is) and the two custom measured parts. They close the edges of the fan and depend on the outer sizes of your fan - so first mount/measure the fan, then cut the MDF.</p>
<p><a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1693/" rel="attachment wp-att-2684"><img class="aligncenter size-medium wp-image-2684" src="http://blog.kanojo.de/files/2011/10/IMG_1693-550x412.jpg" alt="" width="550" height="412" /></a></p>
<p>The case itself is assembled using simple wood glue, then reinforced (as seen below) with counterbored screws to improve the vibration transfer and distribute it throughout the whole case.</p>
<p><a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1694/" rel="attachment wp-att-2685"><img class="aligncenter size-medium wp-image-2685" src="http://blog.kanojo.de/files/2011/10/IMG_1694-550x412.jpg" alt="" width="550" height="412" /></a>   <a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1695/" rel="attachment wp-att-2686"><img class="aligncenter size-medium wp-image-2686" src="http://blog.kanojo.de/files/2011/10/IMG_1695-550x412.jpg" alt="" width="550" height="412" /></a>   <a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1696/" rel="attachment wp-att-2687"><img class="aligncenter size-medium wp-image-2687" src="http://blog.kanojo.de/files/2011/10/IMG_1696-550x412.jpg" alt="" width="550" height="412" /></a></p>
<p>Now comes the trickiest part, and I must admit that I forgot to take detailed notes on this. You need to measure the hard drives (they're more or less 25.4x146.5x101.6mm by specification) and where their screw holes are, then mark those positions on the top of your case, drill holes that will just fit your screws, then drill a little with a larger bit to countersink the screw heads...</p>
<p><a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1697/" rel="attachment wp-att-2688"><img class="aligncenter size-medium wp-image-2688" src="http://blog.kanojo.de/files/2011/10/IMG_1697-550x412.jpg" alt="" width="550" height="412" /></a>   <a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1698/" rel="attachment wp-att-2689"><img class="aligncenter size-medium wp-image-2689" src="http://blog.kanojo.de/files/2011/10/IMG_1698-550x412.jpg" alt="" width="550" height="412" /></a></p>
<p>After finishing the holes I found a nifty little gadget in my old hardware scrap bags: a simple fan control PCB! Woo! I just mounted it where it looked nice. Please note that this is totally optional <img src='http://blog.kanojo.de/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> .</p>
<p><a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1699/" rel="attachment wp-att-2690"><img class="aligncenter size-medium wp-image-2690" src="http://blog.kanojo.de/files/2011/10/IMG_1699-550x412.jpg" alt="" width="550" height="412" /></a>   <a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1700/" rel="attachment wp-att-2691"><img class="aligncenter size-medium wp-image-2691" src="http://blog.kanojo.de/files/2011/10/IMG_1700-550x412.jpg" alt="" width="550" height="412" /></a></p>
<p>The next trick was to get the PSU to start and keep running without attaching a mainboard or switch. Thanks to <a href="http://en.wikipedia.org/wiki/ATX#Power_supply" target="_blank">Wikipedia's ATX article</a> this was very simple. Find the PS_ON pin and any ground pin (the one above/beneath PS_ON, for example), rip their wires out of the poor connector and solder them together... the magic works, the PSU runs!</p>
<p><a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1704/" rel="attachment wp-att-2695"><img class="aligncenter size-medium wp-image-2695" src="http://blog.kanojo.de/files/2011/10/IMG_1704-550x412.jpg" alt="" width="550" height="412" /></a>   <a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1705/" rel="attachment wp-att-2696"><img class="aligncenter size-medium wp-image-2696" src="http://blog.kanojo.de/files/2011/10/IMG_1705-550x412.jpg" alt="" width="550" height="412" /></a>   <a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1706/" rel="attachment wp-att-2697"><img class="aligncenter size-medium wp-image-2697" src="http://blog.kanojo.de/files/2011/10/IMG_1706-550x412.jpg" alt="" width="550" height="412" /></a></p>
<p>Now it's time to mount the hard disks. This turned out to be a little more difficult than expected, as you need to place the drives exactly under the drilled holes without being able to see them. In the design I proposed there is a little space between the drives, which makes this harder still. I had luck shining an LED flashlight on the backsides (the side with the PCB) of the HDDs and looking through the holes from the top of the case...</p>
<p><a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1863/" rel="attachment wp-att-2698"><img class="aligncenter size-medium wp-image-2698" src="http://blog.kanojo.de/files/2011/10/IMG_1863-550x412.jpg" alt="" width="550" height="412" /></a></p>
<p>... and having a hand in the front of the case is also very handy <img src='http://blog.kanojo.de/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> </p>
<p><a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1864/" rel="attachment wp-att-2699"><img class="aligncenter size-medium wp-image-2699" src="http://blog.kanojo.de/files/2011/10/IMG_1864-550x412.jpg" alt="" width="550" height="412" /></a></p>
<p>Then I packed all the SATA cables into a nice large bundle (wow, this looks SO professional <img src='http://blog.kanojo.de/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> )</p>
<p><a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1865/" rel="attachment wp-att-2700"><img class="aligncenter size-medium wp-image-2700" src="http://blog.kanojo.de/files/2011/10/IMG_1865-550x412.jpg" alt="" width="550" height="412" /></a>   <a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1866/" rel="attachment wp-att-2701"><img class="aligncenter size-medium wp-image-2701" src="http://blog.kanojo.de/files/2011/10/IMG_1866-550x412.jpg" alt="" width="550" height="412" /></a></p>
<p>and finally placed it where it belongs - in my drawer above my server!</p>
<p><a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1867/" rel="attachment wp-att-2702"><img class="aligncenter size-medium wp-image-2702" src="http://blog.kanojo.de/files/2011/10/IMG_1867-550x412.jpg" alt="" width="550" height="412" /></a>   <a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1868/" rel="attachment wp-att-2703"><img class="aligncenter size-medium wp-image-2703" src="http://blog.kanojo.de/files/2011/10/IMG_1868-550x412.jpg" alt="" width="550" height="412" /></a>   <a href="http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/img_1869/" rel="attachment wp-att-2704"><img class="aligncenter size-medium wp-image-2704" src="http://blog.kanojo.de/files/2011/10/IMG_1869-550x412.jpg" alt="" width="550" height="412" /></a></p>
<p>After using it for a few days I must say that both noise and heat are very good. Even with the fan on full I can barely hear the array, even under full load (seek hell). The drives report temperatures of 25-41°C through SMART, which is really okay. To the touch it feels only barely warm...</p>
<p>&nbsp;</p>
<h2>IMPLEMENTATION (software)</h2>
<p>Now that the hardware is complete, up and running, let's head for the software. This was a little harder to get right than I expected. As already said, I've chosen a layered design, MD &lt;-&gt; CryptSetup (LUKS) &lt;-&gt; LVM &lt;-&gt; EXT4, for maximum flexibility and data security, as well as a little more help in case of data corruption.</p>
<p>The main problem with the software setup was getting all those abstraction layers neatly aligned. However, first things first. If you've got new drives, what do you do? Right: badblocks and SMART. The thing with badblocks on a new drive is that in almost all cases it will NOT report any bad blocks. This is because a modern hard drive has some spare blocks that are not mapped at the start of its lifetime. As the drive's firmware finds bad blocks during operation, it replaces them with those spare blocks. Only when you run out of spares are you in serious trouble. However, as SMART reports the "Reallocated_Sector_Ct" attribute, you can identify a defective hard drive after an exhaustive badblocks test even if there are spare blocks left.</p>
<p><em>badblocks -s -w /dev/sdX</em></p>
<p>is what you want to run now, once for each new drive, since badblocks takes a single device per invocation (assuming dmesg agrees, your new drives are sde, sdf, sdg, sdh and sdi). Be aware that the write test (-w) overwrites everything on the drive. Do expect it to run quite long: with the read-write test proposed here, around 2-3 days. After that we want to start a SMART long self-test (<em>smartctl -t long /dev/sdX</em>, again once per drive) and then read and try to interpret the results (<em>smartctl -a /dev/sdX</em>).</p>
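<p>Interpreting the results mostly means looking at the raw value of a few attributes. As a minimal sketch, the Python below pulls those raw values out of the attribute table that smartctl prints. The sample text is made up for illustration (it is not output from my drives), and the function name is hypothetical:</p>

```python
# Made-up sample in the column layout of the smartctl attribute table.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
"""


def raw_values(smartctl_output, names):
    """Map each requested attribute name to its integer raw value."""
    found = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have ten columns; the raw value is the last one.
        if len(fields) >= 10 and fields[1] in names:
            found[fields[1]] = int(fields[9])
    return found


watched = {"Reallocated_Sector_Ct", "Current_Pending_Sector"}
for name, raw in sorted(raw_values(SAMPLE, watched).items()):
    status = "suspicious" if raw != 0 else "fine"
    print(name, raw, status)
```

<p>Treating any non-zero raw value as suspicious is my own rule of thumb for a brand-new drive, not something smartctl prescribes.</p>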
<p><img class="alignleft" src="http://www.animenation.net/blog/wp-content/uploads/2010/06/Dantalian_no_Shoka_1024_768.jpg" alt="" width="244" height="183" />The next thing is ... naming your array! It's important for such hardware to have a good name. I've chosen the name "dantalian", from the recent anime "Dantalian no Shoka" (<em>The Mystic Archives of Dantalian</em>, ダンタリアンの書架). The premise is basically that this <del>cute</del> tsundere little girl here on the left contains a large library of magic books in her chest, making her a library (the magic library Dantalian). And as it's magic, it is HUUUGE. What is this array? HUUUGE! Nice fit!</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p>Anyways, back to the software setup. Now that the drives are tested we can start to initialize the array. This can be done with:</p>
<p><em>mdadm --create -x 0 -l 5 -n 5 dantalian /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1</em></p>
<p>after partitioning your drives so that each has one partition of type "FD", aka Linux RAID autodetect. For testing purposes I suggest creating a smaller partition, e.g. 100GB, on each drive. That way you can run your various disaster tests (and I've run SOME!) without the RAID rebuild taking ages each time!</p>
<p>Now that md knows about the array, we need to persist its configuration. This can neatly be done by letting mdadm itself create the config:</p>
<p><em>mdadm --detail --scan &gt;&gt; /etc/mdadm/mdadm.conf</em></p>
<p>This assumes a standard Debian mdadm.conf (in particular one containing "DEVICE partitions", which tells mdadm to probe all partitions known to the system for RAID members...). Now you can monitor your RAID's initial 'rebuild' status by issuing</p>
<p><em>watch -n 10 cat /proc/mdstat</em></p>
<p>which will yield something like:</p>
<p><em>Personalities : [raid6] [raid5] [raid4]<br />
md127 : active raid5 sdi1[5] sdh1[3] sdg1[2] sdf1[1] sde1[0]<br />
7814041600 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/4] [UUUU_]<br />
[&gt;....................] recovery = 1.4% (28754816/1953510400) finish=1590.0min speed=20174K/sec</em></p>
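<p>The numbers in that status line hang together, which is a nice sanity check. Here is a small Python sketch that recomputes the progress and the remaining time from the recovery counters. It assumes, as the units in the line suggest, that the counters are in KiB and the speed in KiB per second. The function name is made up:</p>

```python
import re

# The recovery part of the status output shown above.
MDSTAT_LINE = "recovery = 1.4% (28754816/1953510400) finish=1590.0min speed=20174K/sec"


def rebuild_estimate(line):
    """Return (percent done, minutes left) from an mdstat recovery line."""
    match = re.search(r"\((\d+)/(\d+)\).*speed=(\d+)K/sec", line)
    if match is None:
        raise ValueError("no recovery counters found in line")
    done, total, speed = (int(group) for group in match.groups())
    percent = 100.0 * done / total
    minutes_left = (total - done) / speed / 60.0
    return percent, minutes_left


percent, minutes_left = rebuild_estimate(MDSTAT_LINE)
print("done: %.2f%%, remaining: %.0f min" % (percent, minutes_left))
```

<p>At the reported speed the remaining time comes out at roughly 26.5 hours, in line with the finish value md prints.</p>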
<p>Now, according to <a href="http://wiki.drewhess.com/wiki/Creating_an_encrypted_filesystem_on_a_partition#md_RAID_array" target="_blank">a good source</a></p>
<blockquote><p>If the device to be encrypted is an md RAID array, you should use the --align-payload= to ensure that crypto blocks are aligned on RAID stripes. This option takes as an argument the number of 512-byte sectors in a full RAID stripe. To calculate this value, multiply your RAID chunk size in bytes by the number of data disks in the array (N/2 for RAID 1, N-1 for RAID 5 and N-2 for RAID 6), and divide by 512 bytes per sector. In the example below, /dev/md0 is a RAID 6 device with 4 data disks and a stripe size of 128 kbytes: 128 * 1024 * 4 / 512 = 1024 sectors. # cryptsetup --verbose luksFormat --verify-passphrase --align-payload=1024 /dev/md0 When prompted, supply the key you created in the step above.</p></blockquote>
<p>So, adapted to our scenario (a 512 KiB chunk and 4 data disks, i.e. 512 * 1024 * 4 / 512 = 4096 sectors), that means we'll create the crypt layer using:</p>
<p><em>cryptsetup --verbose luksFormat --verify-passphrase --align-payload=4096 /dev/md/dantalian</em></p>
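<p>If your array has a different geometry, the rule from the quoted source is easy to recompute. A minimal Python sketch (the function name is made up for illustration):</p>

```python
SECTOR_BYTES = 512  # cryptsetup counts --align-payload in 512-byte sectors


def align_payload_sectors(chunk_kib, data_disks):
    """Number of 512-byte sectors in one full RAID stripe."""
    stripe_bytes = chunk_kib * 1024 * data_disks
    return stripe_bytes // SECTOR_BYTES


# This array: RAID 5 over 5 disks (4 data disks), 512 KiB chunk.
print(align_payload_sectors(512, 4))
# Example from the quoted source: 4 data disks, 128 KiB chunk.
print(align_payload_sectors(128, 4))
```

<p>The first call gives the 4096 used above, the second the 1024 from the quoted example.</p>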
<p>Now (after opening the device using <em>cryptsetup luksOpen /dev/md/dantalian dantalian-crypt</em>) let's create the LVM:</p>
<p>Consider this formula:</p>
<p><em>metadatasize = chunk size times number of data disks in the array</em></p>
<p>which, with a 512K chunk and 4 data disks (512K * 4 = 2048K), gives us the command to create the physical volume on the crypto pseudo-device:</p>
<p><em>pvcreate --metadatasize 2048K --dataalignment 4096 -M2 /dev/mapper/dantalian-crypt</em></p>
<p>Now, using <em>pvscan</em>, you can let this new volume be autodetected as:</p>
<p><em>PV /dev/dm-2 lvm2 [7.28 TiB]<br />
Total: 1 [7.28 TiB] / in use: 0 [0 ] / in no VG: 1 [7.28 TiB]</em></p>
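<p>In case you wonder why an "8TB" array shows up as 7.28 TiB: it is only a matter of units. A quick Python check, assuming the block count md reported earlier (7814041600) is in KiB and ignoring the few MiB taken by the LUKS header and the LVM metadata:</p>

```python
KIB_PER_TIB = 1024 ** 3  # 1 TiB = 1024^3 KiB


def kib_to_tib(kib):
    """Convert a size in KiB to TiB (binary units)."""
    return kib / KIB_PER_TIB


def kib_to_tb(kib):
    """Convert a size in KiB to TB (decimal units)."""
    return kib * 1024 / 1000 ** 4


md_blocks_kib = 7814041600  # block count from the md status output
print("%.2f TiB" % kib_to_tib(md_blocks_kib))
print("%.2f TB" % kib_to_tb(md_blocks_kib))
```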
<p>Okay, we've got our physical volume for LVM, so let's create one volume group on it, containing one logical volume (or more, depending on your preferences):</p>
<p><em>vgcreate dantalian-vg-main /dev/mapper/dantalian-crypt </em></p>
<p><em>lvcreate --name dantalian-lv-main -l 100%FREE dantalian-vg-main</em></p>
<p>Now, after some more vgscan and lvscan you should be presented with your /dev/dantalian-vg-main/dantalian-lv-main device - ready to create a filesystem. This again needs to be .... ha - what is it ... ALIGNED! Hell No!</p>
<p>Luckily for us, EXT4 pretty much autodetects that it is on a RAID device and what the stride and stripe-width are, but to be sure let's ask our source from before again:</p>
<blockquote><p>The relevant options for ext3 are stride and stripe-width. stride is identical to the md array chunk size, and stripe-width is identical to the array stripe width, except that both options are specified in units of filesystem blocks instead of bytes. The default ext3 (and ext4) block size is 4096 bytes, so simply divide your chunk size and stripe width by 4096 to get the proper values for these parameters. Here's an example using a RAID 6 array with 6 disks (i.e., 4 data disks) using a chunk size of 128k (stripe size is therefore 512 kbytes)</p></blockquote>
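<p>The rule quoted above can be sketched in a few lines of Python. The function name is made up, and the 4096-byte block size is the default the quote mentions:</p>

```python
def ext4_raid_options(chunk_kib, data_disks, block_bytes=4096):
    """Build the -E argument for mkfs.ext4 from the RAID geometry."""
    stride = chunk_kib * 1024 // block_bytes  # chunk size in filesystem blocks
    stripe_width = stride * data_disks        # full stripe in filesystem blocks
    return "stride={},stripe-width={}".format(stride, stripe_width)


# Example from the quote: 128k chunk, 4 data disks.
print(ext4_raid_options(128, 4))
# This array: 512k chunk, 4 data disks.
print(ext4_raid_options(512, 4))
```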
<p>So for us it's now:</p>
<p><em>stride = chunksize / blocksize = (512*1024)/4096 = 128</em></p>
<p><em>stripe-width = stride * datadisks = 128 * 4 = 512</em></p>
<p>which will then result in the following mkfs command:</p>
<p><em>mkfs.ext4 -n -m 0 -E stride=128,stripe-width=512 -b 4096 /dev/dantalian-vg-main/dantalian-lv-main</em></p>
<p>leaving us (once the -n option, which only makes mkfs do a dry run, is removed) with a nicely aligned filesystem that is aware of the striping/striding. The last point is again a data-security thing. Should the unlikely case ever happen that the crypt layer's metadata gets corrupted, it is really handy to have a backup. This can be achieved by</p>
<p><em>cryptsetup luksHeaderBackup --header-backup-file DANTALIAN-LUKS-HEADERS /dev/md/dantalian </em></p>
<p><em>cryptsetup luksDump /dev/md/dantalian &gt; DANTALIAN-LUKS-HEADER-DUMP</em></p>
<p>with all its drawbacks. Those can be found in the cryptsetup/LUKS <a href="http://code.google.com/p/cryptsetup/wiki/FrequentlyAskedQuestions#6._Backup_and_Data_Recovery" target="_blank">FAQ</a>!</p>
<p>&nbsp;</p>
<p>Now, after using the array for a while, I'm quite fond of it. It's not only faster than I had feared (I do get 25-30MB/s over my Gbit LAN), it has also proven to be quite sturdy in case of bad(TM) events. I've tried killing the power of the array and of the host computer, and tried unplugging a single hard drive or even more than one, all under heavy random access (compiling a kernel). I haven't succeeded in creating a situation that would have led to data loss.</p>
<p>Again, if you're interested in more details, would like to know the exact locations of the holes you need to drill for the hard disks, or anything else, just write an email or leave a comment!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.kanojo.de/2011/10/11/raid-system-8tb-home-storage-on-budget/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Deer me, Deer you, Bookstand-Deer!</title>
		<link>http://blog.kanojo.de/2010/04/25/deer-me-deer-you-bookstand-deer/</link>
		<comments>http://blog.kanojo.de/2010/04/25/deer-me-deer-you-bookstand-deer/#comments</comments>
		<pubDate>Sun, 25 Apr 2010 15:55:27 +0000</pubDate>
		<dc:creator>nebuk</dc:creator>
				<category><![CDATA[Electronic]]></category>
		<category><![CDATA[household]]></category>
		<category><![CDATA[tinkering]]></category>
		<category><![CDATA[Tutorial]]></category>
		<category><![CDATA[birthday present]]></category>
		<category><![CDATA[computer]]></category>
		<category><![CDATA[deer]]></category>
		<category><![CDATA[nerdism]]></category>

		<guid isPermaLink="false">http://blog.kanojo.de/?p=640</guid>
		<description><![CDATA[And off we go for another nice DIY tinkering howto. Again we needed a birthday present for a friend of ours, who is a deliberate Otaku and IRC nerd on #satf, rizon. Those folks happen to have a bot that can draw ASCII-art pictures, mainly of deer, that look just like the bookholder deer below. [...]]]></description>
			<content:encoded><![CDATA[<p>And off we go for another nice DIY tinkering howto. Again we needed a birthday present for a friend of ours, who is a deliberate Otaku and IRC nerd on #satf, rizon. Those folks happen to have a bot that can draw ASCII-art pictures, mainly of deer, that look just like the bookholder deer below.</p>
<p><a rel="attachment wp-att-659" href="http://blog.kanojo.de/?attachment_id=659"><img class="aligncenter size-medium wp-image-659" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7762-366x550.jpg" alt="" width="366" height="550" /></a></p>
<p>As you can imagine, our friend was flabbergasted (I somehow like that word) to the last. Read on for more details on how to build it!</p>
<p><span id="more-640"></span></p>
<p>So, here are the detailed build instructions. As for materials and tools, some of this could be hard to get depending on where you live. We're lucky to have a *good* shop for metal stuff nearby, so we got a cheap piece of leftover aluminum sheet metal. We went for quite thick material, 1.8mm for reference, as we wanted to bend it. That decision might sound like overkill, but it simply looks better and sturdier, and has less risk of breaking when bending it into the final form.</p>
<p>Aside from that you'll need a printer, a hobby knife for cutting out the template, and some transparent film for protecting the metal as well as pinning the template to it. For cutting we used a Dremel (not an original, but one by Proxxon), and for the final finishing you'll need some fine/small files.</p>
<p>If you want to give the aluminum a brushed look you'll need a thick (and straight) piece of wood, an F-clamp and some 600-grit sandpaper.</p>
<p>Again, only IF you want to seal it: we recommend an airbrush/spray gun and proper clearcoat. We used something slightly different, which worked nicely nonetheless.</p>
<p>Okay, off we go for the actual work. Print out your template, cut out the outlines with your hobby knife and pin it to the sheet metal. Then put the protective mask/film over it to protect the metal from scratches and hold the template properly in place.</p>
<div id="attachment_641" class="wp-caption aligncenter" style="width: 560px"><a rel="attachment wp-att-641" href="http://blog.kanojo.de/?attachment_id=641"><img class="size-medium wp-image-641" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7736-550x366.jpg" alt="" width="550" height="366" /></a><p class="wp-caption-text">What you&#039;ll need - and how to use <img src='http://blog.kanojo.de/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> </p></div>
<p>Start cutting out the outlines very slowly (the thick material may take you a long time...)</p>
<div id="attachment_642" class="wp-caption aligncenter" style="width: 376px"><a rel="attachment wp-att-642" href="http://blog.kanojo.de/?attachment_id=642"><img class="size-medium wp-image-642" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7738-366x550.jpg" alt="" width="366" height="550" /></a><p class="wp-caption-text">Slowly (any normal Dremel won&#039;t allow you anything else) cut along your mark. Be careful not to make your cut too long at &quot;inside&quot; corners, so you don&#039;t harm the actual shape.</p></div>
<div id="attachment_643" class="wp-caption aligncenter" style="width: 560px"><a rel="attachment wp-att-643" href="http://blog.kanojo.de/?attachment_id=643"><img class="size-medium wp-image-643" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7739-550x366.jpg" alt="" width="550" height="366" /></a><p class="wp-caption-text">Remember to flip your sheet metal from time to time - the backside is wonderful for refining the edges - especially for &quot;inside corners&quot;</p></div>
<p>As a little note: Really really be careful not to grind too far at "inside corners", it'll look really bad. A nice fix for "too small" edges is to turn over the sheetmetal and cut a bit from the backside, then turn to front again, and so on. Also, if one of your grinding discs got too small - save it for those jobs! <img src='http://blog.kanojo.de/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> </p>
<div id="attachment_644" class="wp-caption aligncenter" style="width: 560px"><a rel="attachment wp-att-644" href="http://blog.kanojo.de/?attachment_id=644"><img class="size-medium wp-image-644" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7741-550x366.jpg" alt="" width="550" height="366" /></a><p class="wp-caption-text">The first (easy) piece comes off ... huge success!</p></div>
<div id="attachment_645" class="wp-caption aligncenter" style="width: 560px"><a rel="attachment wp-att-645" href="http://blog.kanojo.de/?attachment_id=645"><img class="size-medium wp-image-645" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7742-550x366.jpg" alt="" width="550" height="366" /></a><p class="wp-caption-text">Back view of the next, more complicated cut</p></div>
<div id="attachment_647" class="wp-caption aligncenter" style="width: 560px"><a rel="attachment wp-att-647" href="http://blog.kanojo.de/?attachment_id=647"><img class="size-medium wp-image-647" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7747-550x366.jpg" alt="" width="550" height="366" /></a><p class="wp-caption-text">Skipped forward a bit. After cutting you want a good fine and small file to smoothen up the edges and corners. Remember to keep the protection layer so you don&#039;t make any unwanted scratches - which happens easily with files.</p></div>
<div id="attachment_648" class="wp-caption aligncenter" style="width: 560px"><a rel="attachment wp-att-648" href="http://blog.kanojo.de/?attachment_id=648"><img class="size-medium wp-image-648" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7749-550x366.jpg" alt="" width="550" height="366" /></a><p class="wp-caption-text">Detail of the head-section - this was the hardest cutting-work.</p></div>
<p>Okay, now for the brushing. First off: this is a picture of our first try, so please note that it is a bad idea to bend the deer first and brush it afterwards. As for the general method, you need a straight piece of wood (here: the triangle) and push your sheet metal against it to keep it straight. Then take another (smaller) piece of wood and wrap the sandpaper around it. If you now sand using the triangle as a guide, so that all your "streaks" and scratches run in exactly the same direction, you'll get that nice brushed look. You'll need to sand back and forth about 4-5 times; for a finer brushed look, do more passes with a finer (800-grit) sandpaper.</p>
<div id="attachment_649" class="wp-caption aligncenter" style="width: 560px"><a rel="attachment wp-att-649" href="http://blog.kanojo.de/?attachment_id=649"><img class="size-medium wp-image-649" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7750-550x366.jpg" alt="" width="550" height="366" /></a><p class="wp-caption-text">Brushing *after* bending is not a good idea <img src='http://blog.kanojo.de/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> . Remember to do it before.</p></div>
<div id="attachment_650" class="wp-caption aligncenter" style="width: 560px"><a rel="attachment wp-att-650" href="http://blog.kanojo.de/?attachment_id=650"><img class="size-medium wp-image-650" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7752-550x366.jpg" alt="" width="550" height="366" /></a><p class="wp-caption-text">Details of the brushed look.</p></div>
<div id="attachment_646" class="wp-caption aligncenter" style="width: 560px"><a rel="attachment wp-att-646" href="http://blog.kanojo.de/?attachment_id=646"><img class="size-medium wp-image-646" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7745-550x366.jpg" alt="" width="550" height="366" /></a><p class="wp-caption-text">Yay, finished (almost)!</p></div>
<div id="attachment_651" class="wp-caption aligncenter" style="width: 560px"><a rel="attachment wp-att-651" href="http://blog.kanojo.de/?attachment_id=651"><img class="size-medium wp-image-651" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7753-550x366.jpg" alt="" width="550" height="366" /></a><p class="wp-caption-text">Detail before cleaning/sealing.</p></div>
<p><a rel="attachment wp-att-652" href="http://blog.kanojo.de/?attachment_id=652"><img class="aligncenter size-medium wp-image-652" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7754-550x366.jpg" alt="" width="550" height="366" /></a></p>
<div id="attachment_653" class="wp-caption aligncenter" style="width: 560px"><a rel="attachment wp-att-653" href="http://blog.kanojo.de/?attachment_id=653"><img class="size-medium wp-image-653" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7756-550x366.jpg" alt="" width="550" height="366" /></a><p class="wp-caption-text">The two finished deer (not yet cleaned or sealed)</p></div>
<div id="attachment_654" class="wp-caption aligncenter" style="width: 560px"><a rel="attachment wp-att-654" href="http://blog.kanojo.de/?attachment_id=654"><img class="size-medium wp-image-654" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7758-550x366.jpg" alt="" width="550" height="366" /></a><p class="wp-caption-text">For sealing we&#039;ll use our airbrush (0.2mm nozzle)...</p></div>
<div id="attachment_656" class="wp-caption aligncenter" style="width: 560px"><a rel="attachment wp-att-656" href="http://blog.kanojo.de/?attachment_id=656"><img class="size-medium wp-image-656" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7759-550x366.jpg" alt="" width="550" height="366" /></a><p class="wp-caption-text">... and Lascaux transparent matte varnish/sealer. It wasn&#039;t made for this use case, but works like a charm nonetheless. Remember to thin it at least 1:1 before use.</p></div>
<div id="attachment_657" class="wp-caption aligncenter" style="width: 560px"><a rel="attachment wp-att-657" href="http://blog.kanojo.de/?attachment_id=657"><img class="size-medium wp-image-657" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7760-550x366.jpg" alt="" width="550" height="366" /></a><p class="wp-caption-text">After spraying quite a lot of that stuff on the two deer, they need to dry for at least 30 minutes.</p></div>
<div id="attachment_658" class="wp-caption aligncenter" style="width: 560px"><a rel="attachment wp-att-658" href="http://blog.kanojo.de/?attachment_id=658"><img class="size-medium wp-image-658" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7761-550x366.jpg" alt="" width="550" height="366" /></a><p class="wp-caption-text">And finished they are - holding our stack of University and Nerd books <img src='http://blog.kanojo.de/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> </p></div>
<p><a rel="attachment wp-att-659" href="http://blog.kanojo.de/?attachment_id=659"><img class="aligncenter size-medium wp-image-659" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7762-366x550.jpg" alt="" width="366" height="550" /></a></p>
<div id="attachment_660" class="wp-caption aligncenter" style="width: 560px"><a rel="attachment wp-att-660" href="http://blog.kanojo.de/?attachment_id=660"><img class="size-medium wp-image-660" src="http://blog.kanojo.de/files/2010/04/20100410-IMG_7764-550x366.jpg" alt="" width="550" height="366" /></a><p class="wp-caption-text">Check out that DIY brushed alu - nice, isn&#039;t it?</p></div>
<p>Nice, isn't it? You may want to consider a different cutting technique for the rough outlines, as normal rotary tools take *ages* (literally: we spent almost 16 hours grinding and cutting for these two deer). Nonetheless this birthday present was a rock-on gift, and the ability to turn any pixel art into a custom bookstand makes it *THE* idea for a decent gift.</p>
<p>I hope you enjoyed this somewhat picture-centered tutorial. If you have any questions, feel free to use the comment system; I'll answer as soon as I can.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.kanojo.de/2010/04/25/deer-me-deer-you-bookstand-deer/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Mahjong ruleset, TeXed</title>
		<link>http://blog.kanojo.de/2010/01/02/mahjong-ruleset-texed/</link>
		<comments>http://blog.kanojo.de/2010/01/02/mahjong-ruleset-texed/#comments</comments>
		<pubDate>Sat, 02 Jan 2010 17:45:39 +0000</pubDate>
		<dc:creator>nebuk</dc:creator>
				<category><![CDATA[Games]]></category>
		<category><![CDATA[Mahjong]]></category>
		<category><![CDATA[computer]]></category>
		<category><![CDATA[Non-Tut]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://blog.kanojo.de/?p=268</guid>
		<description><![CDATA[As we've been playing more and more Mahjong (not the solitaire version, the real thing) recently and just stumbled upon #mahjong on Rizon, where we got linked a really, really nice ruleset here, I thought ... well, wouldn't it be nice to have this as a booklet printout so you can check the rules or [...]]]></description>
			<content:encoded><![CDATA[<p>As we've been playing more and more <a title="Mahjong" href="http://en.wikipedia.org/wiki/Japanese_Mahjong">Mahjong</a> (not the solitaire version, the real thing) recently and just stumbled upon #mahjong on Rizon, where we got linked a really, really nice ruleset <a title="here" href="http://www.ofb.net/~whuang/ugcs/gp/mahjong/mahjong.html">here</a>, I thought ... well, wouldn't it be nice to have this as a booklet printout so you can check the rules or yaku right at the table if you're unsure.</p>
<p>Okay, so after almost two days of fighting with XeLaTeX to get nice Unicode support and fighting defoma to get a nice font, it's finally done!</p>
<p>You can fetch the <a href="http://tmp.kanojo.de/rules.pdf">PDF</a> here. The booklet version as PostScript (just print it, fold the whole stack in the middle (short-edge oriented) and <a href="http://www.youtube.com/watch?v=pgD1cNiVLSM">staple</a> it together) is available <a href="http://tmp.kanojo.de/rules2up.ps">here</a>.</p>
<p>Also note that there might be a few mistakes due to the hardcore TeX action in the typesetting. Feel free to report those to me to get 'em fixed. For errors in the original document please contact the original author or me.</p>
<p>I hope you have fun playing with these rule sheets, and I hope they came out nicely <img src='http://blog.kanojo.de/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> . For a nice, short yaku overview just look <a href="http://www.osamuko.com/2009/12/20/yaku-overview-pdf/">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.kanojo.de/2010/01/02/mahjong-ruleset-texed/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Robust HTML Parsing (in Ruby)?</title>
		<link>http://blog.kanojo.de/2010/01/01/robust-html-parsing-in-ruby/</link>
		<comments>http://blog.kanojo.de/2010/01/01/robust-html-parsing-in-ruby/#comments</comments>
		<pubDate>Fri, 01 Jan 2010 18:12:33 +0000</pubDate>
		<dc:creator>nebuk</dc:creator>
				<category><![CDATA[Electronic]]></category>
		<category><![CDATA[tinkering]]></category>
		<category><![CDATA[computer]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[Techniques]]></category>

		<guid isPermaLink="false">http://blog.kanojo.de/?p=227</guid>
		<description><![CDATA[Have you ever wanted to parse information from some rather complex or totally broken (in terms of HTML standards compliance) website? Maybe you tried fighting that problem with regular expressions or a DOM or SAX XML parser. If you did, you probably ran into some problems: Maybe there were too many similar matches for your regex [...]]]></description>
			<content:encoded><![CDATA[<p>Have you ever wanted to parse information from some rather complex or totally broken (in terms of HTML standards compliance) website? Maybe you tried fighting that problem with regular expressions or a DOM or SAX XML parser. If you did, you probably ran into some problems: Maybe there were too many similar matches for your regex because the website repeats similar patterns, or your XML parser went crazy on invalidly formatted or non-XHTML-compliant content?</p>
<p>I wanted to parse a website that had no RSS feed for changes and create an RSS feed for it. I first tried several of the ideas mentioned above, but as the website is kind of "irregular" (every item is slightly different) and the W3C validator shows over 11k errors (in 1.1 transitional), I had quite some problems.</p>
<p>Until I found Ruby's Hpricot, an HTML parser that lets you do robust parsing of fucked-up, non-standard-compliant content with ease.</p>
<p><span id="more-227"></span></p>
<p>Hpricot is quite simple to use. The basic idea of the parsing part is that you specify the order of tags you want to walk down in the tree. So maybe you want the content of a div inside a td inside a tr inside a table inside a table inside a div ... you get the idea. By the way, <a title="Firebug" href="http://getfirebug.org">Firebug</a> is extremely useful for finding the structures you need in the HTML tree hierarchy. Hpricot will walk down all paths that match your criteria and return the portion of the tree found at the end of each:<br />
require 'hpricot'<br />
require 'open-uri'<br />
overview = Hpricot(open("INSERT SOME URL HERE"))<br />
prodno = 0<br />
(overview/"table").each do |product|<br />
  if product.attributes=={} and not product.to_s.include? "closed on the days marked"<br />
    (product/"tr/td/table").each do |article|<br />
      prodno += 1<br />
    end<br />
  end<br />
end<br />
This is a small snippet of my parser code. It opens the URL, fetches the content and creates an Hpricot parser object, then checks for every table whether it's the one we are looking for (identified by its attributes and content text). Then it counts every item (in a 2-column table).</p>
<p>As you can see, the basic idea is quite simple: fetch an element by its tree position, identify it beyond doubt, then do the actual magic.</p>
<p>Hpricot also helps you a lot with the actual magic! Things like<br />
linkurl = (content/"a")[0].attributes['href'].to_s.gsub("\r\n","")<br />
imgurl = (img/"img")[0].attributes['src'].to_s.gsub("\r\n","")<br />
name = (content/"a")[0].inner_text.to_s.gsub("\r\n","")<br />
are so simple (attributes seems self-explanatory, inner_text extracts only the text, not the tags that are children of the element you call it on).</p>
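<p>The repeated .to_s.gsub("\r\n","") in the lines above could be pulled into a small helper. This is only a minimal sketch in plain Ruby, assuming the values respond to to_s; the name clean_text is made up for this illustration and is not part of Hpricot, and the extra strip is an addition the original lines don't do:</p>

```ruby
# Hypothetical helper (not part of Hpricot): removes the CR/LF line breaks
# that the snippets above strip with gsub("\r\n", ""), and additionally
# trims surrounding whitespace.
def clean_text(value)
  value.to_s.gsub("\r\n", "").strip
end

# Made-up sample string standing in for an inner_text result.
puts clean_text("  Some\r\n Figure Name \r\n")
```

<p>With something like this, a line from above would shrink to name = clean_text((content/"a")[0].inner_text).</p>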
<p>A slightly more sophisticated example would be this:<br />
      detail = Hpricot(open(linkurl))</p>
<p>      price = nil<br />
      stock = nil<br />
      sale = nil<br />
      (detail/"table//tr//td//table").each do |data|<br />
        if data.to_s.include? "can buy from here"<br />
          entries = (data/"tr")<br />
          entries.delete_at(0)<br />
          entries.each do |entry|<br />
            estr = entry.to_s.downcase<br />
            if estr.include? "sale price"<br />
              price = (entry/"td")[1].inner_text.to_s.gsub("\r\n","")<br />
            elsif estr.include? "sale status"<br />
              sale = (entry/"td")[1].inner_text.to_s.gsub("\r\n","")<br />
            elsif estr.include? "stock status"<br />
              stock = (entry/"td")[1].inner_text.to_s.gsub("\r\n","")<br />
            end<br />
          end<br />
        end<br />
      end<br />
Here a site like <a title="this" href="http://www.amiami.com/shop/?set=english&amp;vgForm=ProductInfo&amp;sku=FIG-MOE-0559&amp;template=default/product/e_display.html">this</a> is parsed for information like stock status, sale status and price tag.</p>
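<p>The if/elsif chain above is essentially a lookup from a label to a field, so it could also be written table-driven. The following is a rough sketch under a few assumptions: it uses plain Ruby without Hpricot, the rows are assumed to be already reduced to [label, value] string pairs, and the names FIELDS and extract_fields as well as the sample data are invented for this example:</p>

```ruby
# Maps a lowercase label fragment to the field it fills. The fragments are
# the ones matched in the snippet above.
FIELDS = {
  "sale price"   => :price,
  "sale status"  => :sale,
  "stock status" => :stock
}

# Returns a hash with :price, :sale and :stock. Fields that never match
# stay nil, mirroring the nil defaults in the snippet above.
def extract_fields(rows)
  result = { :price => nil, :sale => nil, :stock => nil }
  rows.each do |label, value|
    FIELDS.each do |fragment, field|
      next unless label.downcase.include?(fragment)
      result[field] = value.gsub("\r\n", "")
      break # first matching fragment wins, like the if/elsif chain
    end
  end
  result
end

# Made-up sample rows, not real data from the shop.
rows = [
  ["Sale Price", "3,800 JPY\r\n"],
  ["Stock Status", "In stock"]
]
p extract_fields(rows)
```

<p>Adding another field then only means adding one more line to FIELDS instead of another elsif branch.</p>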
<p>As you can see, this is a more robust approach than Python's string.split or XML DOM/SAX parsers, which don't work for non-standard sites. It's not as perfect as I would wish for "easy" HTML parsing, but it's better than anything I've seen so far.</p>
<p>I'll also post the script for parsing amiami.com for changes later (beware, no nice Ruby code, hacked together late at night in 1h <img src='http://blog.kanojo.de/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> ) so you can see a more elaborate example. I hope you'll have more fun parsing HTML using Hpricot and these snippets <img src='http://blog.kanojo.de/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> .</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.kanojo.de/2010/01/01/robust-html-parsing-in-ruby/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The pain of being independent</title>
		<link>http://blog.kanojo.de/2009/12/07/the-pain-of-being-independent/</link>
		<comments>http://blog.kanojo.de/2009/12/07/the-pain-of-being-independent/#comments</comments>
		<pubDate>Mon, 07 Dec 2009 21:30:57 +0000</pubDate>
		<dc:creator>nebuk</dc:creator>
				<category><![CDATA[Electronic]]></category>
		<category><![CDATA[computer]]></category>
		<category><![CDATA[General]]></category>
		<category><![CDATA[Non-Tut]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[server]]></category>

		<guid isPermaLink="false">http://blog.kanojo.de/?p=95</guid>
		<description><![CDATA[I've spent a whole lot of time over the last few days tinkering on various parts of my root server. As time passes you get used to the comfort of various (web based) tools such as GMail, Google-Calendar, Google Reader, etc. You may notice that you just read the word "Google" quite often, [...]]]></description>
			<content:encoded><![CDATA[<p>I've spent a whole lot of time over the last few days tinkering on various parts of my root server. As time passes you get used to the comfort of various (web based) tools such as GMail, Google-Calendar, Google Reader, etc. You may notice that you just read the word "Google" quite often, so what pops into your mind? Right, privacy. Google mines quite a lot of your data. Ever checked the ads Google shows you on various sites (given you don't use a capable ad blocker)? Sometimes it gets quite creepy. That data is quite valuable for profiling your behavior, and that profile is (not related to your persona, but in general) sold to marketing monkeys.</p>
<p>So, that's why you might want to rebuild all those tools you're used to in a trusted environment: your own server. You don't want your mail stored in some possibly hostile environment on an untrusted machine that could leak your valuable data. Turns out it's not that easy sometimes. I've only worked on getting a capable webmailer and feed reader to run smoothly. What surprised me here, and why I'm writing about this, is that it was exceptionally hard (or rather time-intensive) for something that sounds this easy.</p>
<p><img class="size-full wp-image-99 alignleft" src="http://blog.kanojo.de/files/2009/12/google-privacy.gif" alt="google-privacy" width="250" height="202" /></p>
<p>I first targeted the reader and looked around Freshmeat and SourceForge, where you expect to find decent free software for that task. I found quite a few not-so-simple-looking projects, including Tiny Tiny RSS. It turned out Tiny Tiny RSS is almost there, but needs PostgreSQL. So I set up PostgreSQL, installed TTRSS and configured it. Imported Google Reader's OPML, and zap, it worked. The problem: all the feeds are in one big table, which made the whole thing painfully slow. So I was up for 7-8 hours of hardcore Postgres performance tuning, trying to hack memcached into TTRSS, etc. That got a speedup of almost 100%, yet it was not close to being usable. It turns out the developer didn't intend the project to keep articles around as a searchable archive. So on to something different: Gregarius, which the TTRSS dev recommended. This worked pretty much out of the box, except for 3-4 hours of tinkering and writing small plugins to get the whole thing to work properly. But it does, and it has almost all the features one expects.</p>
<p>The harder part came now: webmail. I first tried to hack the roughly-set-up RoundCube that was still on my server. After short testing and many functions that just did not work for unknown reasons I knew I needed something different, and started toying around with Horde and its webmailer Imp(4) in the Horde Webmail Edition pack. Horde feels somehow "unix style"-ish, like ... building a highly reusable backend, letting other projects include/work on top of that backend, etc. But let me say one thing: this beast is so darn hard to set up. I got it working, more or less, after hours and hours of doc reading, tinkering around with MySQL tables and databases and reading Horde source due to its ... slightly lacking ... documentation on some points. Still, it was unstable and lacked features too. So I finally decided Horde/IMP was a bit too much to go with. After some more searching around, it was back to RoundCube.</p>
<p><img class="size-medium wp-image-101 alignright" src="http://blog.kanojo.de/files/2009/12/fB2tdAiwKpnaovqajbieqTQCo1_500-449x550.png" alt="GoogleXKCD" width="300" height="367" /></p>
<p>Just that this didn't make things better; well... a bit at least. It takes tinkering, fixing old plugins to work with the current version, finding out why the hell buttons are greyed out that shouldn't be, and getting a "well, reset the database (again)" from the developers. All of that fun. As of now I have at least managed to get everything working except for filter rules and sa-learning ham. Phew!</p>
<p>Okay, so why am I writing all this, you may ask yourself. Well, I've gone through some PITA for being independent. If you're used to professional, custom-coded systems backed by real money (such as the Google stuff), it really takes some work and hackish tinkering to get the Free Software tools given to us by the community (which I'm not ranting against, by the way; all that code out there is really beautiful in fact) to the same level. I also want to encourage everyone out there not to give away their data but to build something of their own and keep their data, as computer users did for a long, long time, and for good reason.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.kanojo.de/2009/12/07/the-pain-of-being-independent/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>GNU/Linux  iPhone Sync &#8211; Wireless! Funambol error -1, yay!</title>
		<link>http://blog.kanojo.de/2009/10/30/gnulinux-iphone-sync-wireless-funambol-error-1-yay/</link>
		<comments>http://blog.kanojo.de/2009/10/30/gnulinux-iphone-sync-wireless-funambol-error-1-yay/#comments</comments>
		<pubDate>Fri, 30 Oct 2009 14:51:09 +0000</pubDate>
		<dc:creator>nebuk</dc:creator>
				<category><![CDATA[Electronic]]></category>
		<category><![CDATA[administration]]></category>
		<category><![CDATA[bugfix]]></category>
		<category><![CDATA[computer]]></category>
		<category><![CDATA[iPhone]]></category>
		<category><![CDATA[server]]></category>
		<category><![CDATA[tinkering]]></category>

		<guid isPermaLink="false">http://blog.kanojo.de/?p=52</guid>
		<description><![CDATA[I recently got myself an iPhone for tinkering, development (I've got a few nice ideas) and general nerdism. I've run into a few problems syncing my PIM data (stuff like Contacts, Tasks, etc.), especially since I use GNU/Linux, which is no platform to run iTunes. Pictures and music are no problem, as gtkpod and the [...]]]></description>
			<content:encoded><![CDATA[<p>I recently got myself an iPhone for tinkering, development (I've got a few nice ideas) and general nerdism. I've run into a few problems syncing my PIM data (stuff like Contacts, Tasks, etc.), especially since I use GNU/Linux, which is no platform to run iTunes. Pictures and music are no problem, as gtkpod and the like support the iPhone. Just the important stuff does not work out nicely.</p>
<p><span id="more-52"></span></p>
<p>So, after a *ton* of googling and looking around I found out there's an app for jailbroken iPhones which reads the .sqlitedb files of Contacts, Calendar and Notes and sends their contents to a specialized synchronization server for PIM data. Luckily there's a bunch of clients for other platforms, including S60, Evolution, LDAP, etc. We're interested in the Evolution one.</p>
<p>So, I installed Funambol (the sync server) on my server and started it up. The first thing to notice is that the startup often does not work on the first try: I had to start/stop the server a few times before it would serve me on http://hostname:8080/funambol/. There also seems to be a fucked-up state where the HTTP server is running but the admin interface won't let you log in. At that point I also had to restart the server a few times *sigh*.<br />
Then I installed iPhoneSync, the iPhone Funambol client, on my handset, started it up, entered my server's data and synced. Woosh, failed. The log said "server returned error -1" ... wowziez, cool, what's that? No table of error codes, no description, no server debug log entries, nothing, not even a bug report on some mailing list. Basically it's a Funambol bug concerning contact thumbnails larger than 8k *sigh*. At least it's fixed in svn trunk...</p>
<p>Now comes the fun part: compiling Funambol. As Funambol needs Tomcat, NetBeans, Maven 2 and a whole bunch of other dependencies, it's *really* fun. With some tinkering I managed to build an installer package that now even works with the iPhone.<br />
So, for all of you out there running into the same bug as me, here's a Funambol binary of svn trunk with this bug fixed:</p>
<p><a title="funambol-8.2.2-SNAPSHOT.bin" href="http://tmp.kanojo.de/funambol-8.2.2-SNAPSHOT.bin">http://tmp.kanojo.de/funambol-8.2.2-SNAPSHOT.bin</a></p>
<p>So - here's what you basically need to do:</p>
<ol>
<li>scp the above binary to your server, ssh there and execute "sh <a title="funambol-8.2.2-SNAPSHOT.bin" href="http://tmp.kanojo.de/funambol-8.2.2-SNAPSHOT.bin">funambol-8.2.2-SNAPSHOT.bin</a>"</li>
<li>install the server as guided by the installation program; I installed it to /opt. Also note it *really* only installs stuff there, so no worries about files outside your system's package management <img src='http://blog.kanojo.de/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> .</li>
<li>when asked whether you want to start the server, say no</li>
<li>cd to /&lt;wherever&gt;/Funambol/, execute "./bin/funambol-server start" and wait a few minutes (you can also check in top whether java has stopped eating up all your CPU power)</li>
<li>start up the admin interface with "./admin/bin/funamboladmin", either on the server using remote X (ssh -X) or from an installation of Funambol on your current desktop box.</li>
<li>there, log in using "admin" as user and "sa" as password, go to user management, change the admin password (important! <img src='http://blog.kanojo.de/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> ) and log in again.</li>
<li>once again in user management, create a new user for yourself.</li>
<li>on your jailbroken iPhone, install "iphonesync" via Cydia, start it up, enter your server's data (http://yourhostname:8080/funambol/ds) and sync!</li>
<li>on your desktop box, install syncevolution; to configure it, do the following:</li>
<li>execute "syncevolution --configure --sync-property 'username=123456' --sync-property 'password=!@#ABcd1234' funambol" (with your own username and password in place of these example values; the single quotes keep the shell from interpreting special characters such as "!"), then open up ~/.config/syncevolution/funambol/config.ini and change the URI to the same one you used on your iPhone</li>
<li>sync using syncevolution -s &lt;mode&gt; funambol (&lt;mode&gt; can be one of the items shown by "syncevolution -s ?")</li>
</ol>
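<p>The syncevolution configuration call in the steps above carries a password with shell-special characters, which is easy to get wrong when typing. As a minimal sketch, assuming you wanted to script that call from Ruby, the standard library's Shellwords can take care of the quoting. The helper name configure_command is made up for this example, and the credentials are just the example values from the step above:</p>

```ruby
require 'shellwords'

# Hypothetical helper: builds the "syncevolution --configure" command line
# from the configuration step above, shell-escaping every argument.
def configure_command(username, password, server = "funambol")
  argv = [
    "syncevolution", "--configure",
    "--sync-property", "username=#{username}",
    "--sync-property", "password=#{password}",
    server
  ]
  Shellwords.join(argv)
end

# Prints a command line that can be pasted into a shell.
puts configure_command("123456", "!@#ABcd1234")
```

<p>Alternatively, passing the arguments as separate strings to Ruby's system() avoids the shell (and therefore the quoting problem) altogether.</p>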
<p>Happy Syncing!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.kanojo.de/2009/10/30/gnulinux-iphone-sync-wireless-funambol-error-1-yay/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

