Jan Harasym

Designing highly scalable/resilient infrastructure by day; running hacker communities by night.


Reminder: Keep your IRC Email up to date

If you’ve registered with NickServ on darkscience within the last few years then you’ll have used an email address and we’ll have sent you a mail to verify it. That will probably be the last time you heard from us…

…until you forget your password and find yourself unable to identify to your account. When that happens we can send an email (only to that same address) to verify your identity and reset your password.

You aren’t stuck with the email you originally used though! I’d very strongly recommend you take 5 minutes to double check the set email address is current, especially in light of recent service closures. You don’t need access to your old inbox to change your registered email, just your NickServ password.

To view the current state of your account, while identified type:

/msg nickserv info

If you’d like to then change the registered email address, first…

/msg nickserv set...

Continue reading →


Elasticsearch Notes

Elasticsearch consists of two components.

  • Elasticsearch: clustering engine and REST API
  • Lucene: search backend (indexes are always raw Lucene)

You need to understand how both work.

Lucene:

Index Merges

This video displays how index merges occur:
Indexing Mediawiki

Basically when you have enough segments that can be grouped they will be vacuumed and merged.

Source

Memory Pressure/Heap:

If you monitor the total memory used on the JVM you will typically see a sawtooth pattern where the memory usage steadily increases and then drops suddenly.

Sawtooth

The reason for this sawtooth pattern is that the JVM continuously needs to allocate memory on the heap as new objects are created as a part of normal program execution. Most of these objects, however, are short-lived and quickly become available for collection by the garbage collector. When the garbage collector finishes you’ll see a drop on the memory...

Continue reading →


Tor @ DarkScience

For completeness I’m going to write this post as if you know nothing about me or Tor. If you know about DarkScience and Tor you can safely skip to here.

Tor

Tor, or more formally “The Onion Router”, is a kind of decentralised VPN; it’s more commonly referred to as an “anonymity network”, and that description is very apt. Tor is designed as an encrypted mesh network operating over the top of the regular internet. There are intermediaries which have no concept of the data, where it came from or where it’s headed, only its next hop in the chain. There are entry points which only know where the data came from but not where it’s going or what it is, and more famously there are exits which take traffic originating from somewhere in the Tor network and allow it to enter the public internet.

Tor is typically rather slow, and it’s possible to address servers and services without ever leaving Tor...

Continue reading →


Please stop advocating wildcard certificates.

Ever since I started “doing the computer stuff” I’ve been a bit wary of SSL/TLS.
It’s very easy to get wrong and revocation is not a solved problem no matter which security vendor is trying to push for sales.

There is, however, a strong urge to do things as easily as possible. For most people using SSL/TLS this becomes:

  • Use openssl commands from a 5 year old blog to generate a CSR.
  • Paste your CSR onto some web form.
  • Copy/Paste ciphers/config from some other website.
  • Use wildcard certs so you never have to go through this pain again.

Some of these steps are worse than others. There’s almost no risk in using old openssl commands (except for a smaller key size), but if you have an old list of cipher suites you’ll probably be using a deprecated cipher, and your clients’ data may be vulnerable in transit.
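For contrast with the wildcard shortcut, a CSR can name exactly the one host it is for. This is a minimal sketch, assuming OpenSSL 1.1.1 or newer (needed for `-addext`); the `make_csr` helper, the 3072-bit RSA key and `www.example.com` are illustrative choices, not recommendations taken from this post.

```shell
# make_csr HOST
# Hypothetical helper: writes HOST.key and HOST.csr into the current
# directory. The CSR names exactly one host, with no wildcard.
make_csr() {
    host="$1"
    # -nodes leaves the private key unencrypted; protect the file yourself.
    openssl req -new -newkey rsa:3072 -nodes \
        -keyout "$host.key" -out "$host.csr" \
        -subj "/CN=$host" \
        -addext "subjectAltName=DNS:$host"
}
```

Running `make_csr www.example.com` and then `openssl req -in www.example.com.csr -noout -text` shows what the CA will be asked to sign.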

The worst one by far is the wildcard certificate; and this is for mostly obvious...

Continue reading →


Dropping filesystem caches.

By writing to /proc/sys/vm/drop_caches the Linux kernel will drop clean caches, dentries and inodes from memory, causing that memory to become free. (yay!)

To free pagecache:

echo 1 > /proc/sys/vm/drop_caches

To free dentries and inodes:

echo 2 > /proc/sys/vm/drop_caches

To free pagecache, dentries and inodes:

echo 3 > /proc/sys/vm/drop_caches

As this is a non-destructive operation, and dirty objects are not freeable, the user should run “sync” first in order to make sure all cached objects are freed.

This tunable was added in 2.6.16.

It’s pretty handy when trying to benchmark disks.
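For benchmark runs the sync and the write can be wrapped into one step. This is a minimal sketch, assuming a POSIX shell and root privileges for the real tunable; the `drop_caches` function name and its optional file argument are made up for illustration.

```shell
# drop_caches MODE [FILE]
# MODE: 1 = pagecache, 2 = dentries and inodes, 3 = both.
# FILE defaults to the real tunable; the override exists only so the
# function can be tried against a scratch file without root.
drop_caches() {
    mode="$1"
    target="${2:-/proc/sys/vm/drop_caches}"
    case "$mode" in
        1|2|3) ;;
        *) echo "usage: drop_caches 1|2|3 [file]" >&2; return 2 ;;
    esac
    sync                      # write out dirty pages so they become droppable
    echo "$mode" > "$target"  # needs root when target is the real tunable
}
```

Calling `drop_caches 3` as root before each benchmark pass starts every pass from a cold cache.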

Sidenote: on FreeBSD this is impossible; you have to unmount the disk and remount it.

View →


How to write good.

  1. Avoid Alliteration. Always.
  2. Prepositions are not good words to end sentences with.
  3. Avoid cliches like the plague. They’re old hat
  4. Comparisons are as bad as cliches.
  5. Be more or less specific.
  6. Writers shouldn’t generalise.
    Seven. Be consistent.
  7. Don’t be redundant; don’t use more words than necessary; don’t be superfluous.
  8. Who needs rhetorical questions.
  9. Exaggeration is a billion times worse than an understatement.

View →


don’t pipe curl to bash

If you have installed developer-focused third-party software recently, you will probably have seen the following command line suggested as a way of installing a software package directly from the web:

curl -s http://example.com/install.sh | sh

This post is not here to debate whether or not this is a good idea but rather to make those that use this pattern aware of a non-obvious flaw, aside from all the obvious issues with piping 3rd party data directly into your shell. There have been countless discussions on this method and one argument for it has always been transparency - as in, you can simply check the script by opening it in your browser before piping it to bash via curl.

This post is here to a) show that this level of trust can be hijacked and b) to provide an easy way of protecting yourself when you wish to install via curl.
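One common precaution, which is not necessarily the one this post goes on to describe, is to download the script once, read the saved copy, and run only that copy. This is a minimal sketch, assuming curl and a POSIX shell; `fetch_then_run` is a hypothetical helper name.

```shell
# fetch_then_run URL
# Hypothetical helper: download once, show the saved copy, and run
# that same copy only after an explicit "y".
fetch_then_run() {
    url="$1"
    script=$(mktemp) || return 1
    # -f: fail on HTTP errors, -sS: quiet but report errors, -L: follow redirects
    if ! curl -fsSL "$url" -o "$script"; then
        rm -f "$script"
        return 1
    fi
    ${PAGER:-less} "$script"            # read exactly the bytes that were saved
    printf 'Run this script? [y/N] ' >&2
    read -r answer
    status=0
    if [ "$answer" = "y" ]; then
        sh "$script"                    # runs the reviewed file, not a second download
        status=$?
    else
        echo "Not run."
    fi
    rm -f "$script"
    return "$status"
}
```

The point of the design is that the server is contacted once, so the bytes you read are the bytes that get executed.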
Proof of concept...

Continue reading →


SaltStack notes

Primitives

Minions

Minions: salt “clients”, aka hosts / provision targets. (not to be confused with the salt command-line client salt)

Master

Master: the salt server, which drives the provisioning of minions. The salt CLI client runs on the master. The master is an ensemble of several services and worker processes.

  • Publisher (port 4505): which minions must be able to access for pull-mode
  • EventPublisher (IPC only):
  • MWorker: one or more “master workers”, which handle salt operations concurrently
  • ReqServer (port 4506): pop work and push to MWorker, plus receiving replies so MWorker doesn’t have to block
  • File Server (?): transfers files to minions on demand from the state tree
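Because minions connect out to the master on the two TCP ports listed above, a reachability check run from a minion can rule out firewall problems early. This is a minimal sketch, assuming bash (for its `/dev/tcp` pseudo-device) and the coreutils `timeout` command are installed; `port_open`, `check_master` and the hostname in the usage note are placeholders.

```shell
# port_open HOST PORT: succeeds if a TCP connection opens within 3 seconds.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# check_master HOST: report on the publisher (4505) and request (4506) ports.
check_master() {
    rc=0
    for port in 4505 4506; do
        if port_open "$1" "$port"; then
            echo "$1:$port reachable"
        else
            echo "$1:$port NOT reachable"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, `check_master salt.example.com` prints one line per port and returns non-zero if either port cannot be reached.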

Grains

Grains are basically what the Ansible/Puppet world calls facts.

Pillar

Pillar is a global value/config storage, spelled out on the master. This is basically YAML which is laid out in folder hierarchies...

Continue reading →


Friends don’t let friends use BTRFS for OLTP

I usually write rant-style posts, and today is no exception. A few months ago I was working on a benchmark comparing how PostgreSQL performs on a variety of Linux/BSD filesystems, both traditional ones (EXT3, EXT4, XFS) and new ones (BTRFS, ZFS, F2FS, HAMMER). Sometimes the results came out a bit worse than I hoped for, but most of the time the filesystems behaved quite reasonably and predictably. The one exception is BTRFS …

Now, don’t get me wrong - I’m well aware that filesystem engineering is a complex task and takes a non-trivial amount of time, especially when the filesystem aims to integrate as much functionality as BTRFS (some would say way too much). Dave Chinner stated that it takes 8-10 years for a filesystem to mature, and I have no reason not to trust his words. I’m not an XFS/EXT4 zealot, I’m actually a huge fan of filesystem improvements (and I don’t really like EXT4 so much)...

Continue reading →


Theatre: Lolita

(@ London Theatre)

I recently (as of 20 minutes ago actually) attended a production of Lolita, a representation of Stanley Kubrick’s work (they say on posters).

I had gone in with no expectations, well, when you purchase tickets for “The London Theatre” online you expect something grandiose in the heart of theatreland.

However, this was not one of those. This was a “Fringe Theatre”, which I’d never heard of, but I’m open-minded enough, although it’s situated in New Cross (not exactly known for its cultural prowess).

When we arrived at New Cross Gate station we were invited to walk over a rather sketchy looking scaffold bridge between platforms if we wanted to leave; once we got outside we navigated through the even sketchier neighbourhood.

I’ve walked through New Cross before (back when I lived in Lewisham), and back then I had been hardened by my time in Coventry; however, I’m a...

Continue reading →