Forecasting Database Growth

Been thinking about the most accurate method for forecasting. Closer to my line of work, I’d like to forecast database growth over a specified period of time into the future.

The common method I have seen is to simply use Excel and plot a trend line through previously collected data. While this is quite usable in most scenarios, I don’t believe it captures everything, seasonality in particular. Seasonality here refers to the periodic data growth and purging that occurs. Most people ignore this or are simply not aware of it, but most databases have some seasonal growth, which needs to be added to the forecasting model.

The model used will be a 2 factor model of the form:

y = D + X + K

Where, y = the forecast database size, D = the seasonality variable, X = the stochastic variable with trend, and finally K is a constant of some sort.
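To make the additive form a bit more concrete, here is a minimal sketch in Perl. It is not the final model: it assumes X can be approximated by a plain straight-line trend (so the stochastic part is ignored), estimates K and the slope by ordinary least squares, and estimates D as the average leftover per position in the cycle. The names fit_additive and forecast are made up for illustration, and the data is synthetic with a toy cycle of 4 samples; with hourly samples a cycle of 24 (daily) or 168 (weekly) would be the natural choice.

```perl
use strict;
use warnings;

# Hypothetical helper: fit y = K + slope*t by least squares, then
# average the residuals per slot in the cycle to estimate D.
sub fit_additive {
    my ($sizes, $period) = @_;
    my $n = scalar @$sizes;
    my ($sum_t, $sum_y, $sum_tt, $sum_ty) = (0, 0, 0, 0);
    for my $t (0 .. $n - 1) {
        $sum_t  += $t;
        $sum_y  += $sizes->[$t];
        $sum_tt += $t * $t;
        $sum_ty += $t * $sizes->[$t];
    }
    my $slope = ($n * $sum_ty - $sum_t * $sum_y)
              / ($n * $sum_tt - $sum_t ** 2);
    my $k = ($sum_y - $slope * $sum_t) / $n;

    # Seasonal component: mean residual for each slot in the cycle.
    my @total = (0) x $period;
    my @count = (0) x $period;
    for my $t (0 .. $n - 1) {
        my $slot = $t % $period;
        $total[$slot] += $sizes->[$t] - ($k + $slope * $t);
        $count[$slot]++;
    }
    my @seasonal = map { $count[$_] ? $total[$_] / $count[$_] : 0 }
                   0 .. $period - 1;

    return { k => $k, slope => $slope, seasonal => \@seasonal, period => $period };
}

# y = D + X + K, with X approximated by slope * t.
sub forecast {
    my ($model, $t) = @_;
    return $model->{seasonal}[ $t % $model->{period} ]
         + $model->{slope} * $t
         + $model->{k};
}

# Synthetic sizes: steady growth of 2 per sample plus a repeating bump.
my @bump  = (0, 5, 0, -5);
my @sizes = map { 100 + 2 * $_ + $bump[ $_ % 4 ] } 0 .. 47;

my $model = fit_additive(\@sizes, 4);
printf "forecast for sample 48: %.2f\n", forecast($model, 48);
```

Because the bump is slightly correlated with time, the fitted slope lands close to, but not exactly on, the true value of 2. That is one reason a straight-line fit is only a starting point.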

Before we can identify seasonality and subsequently forecast, we need to gather some data (a time series). In my case I’m going to collect the database size hourly, simply because I want to retain a level of precision.

Let’s use a simple Perl script (for portability; honestly, anything that can remotely execute a SQL query will do) which connects to the database and executes some SQL.

See the code for data gathering in part 2.

VMware Server 2 Multicore Performance

Noticed that although one of my VMs (Win2k8, 64-bit) was configured with 2 GB of RAM and 2 CPU cores, the performance, even while idle, was not as expected. It was sluggish and so unresponsive that the snap-in loader timed out all the time. This was very frustrating. I increased the RAM, the pagefile, you name it, but still had the same issue.

So I decided to allocate just one CPU core, and the performance improved significantly. I’m going to investigate what the issue was; maybe the guest OSes were starving the host? I will delve deeper in another post.

Don’t want to learn Perl, Python, Shell … Ok, try On.Inno

There are people on this planet earth, wonderful as it is, that don’t want to learn about the glue that holds all things together. And by glue I’m referring to the workhorse scripts that never really get the attention they deserve:

  • Perl
  • Python
  • Shell

One cannot truly say they have experienced a production infrastructure without typing at least one command from any of the above.

Well, those who are not ready for this level can jump to the next one. I have something for you called on.inno.

on.inno is a high-level scripting language that has just two logical operators (IF … ELSE) … that’s it.

It’s simple, more to come.

Example:

EXECIFSTART<,>hostname<,>piroserv2<,>1

EXEC<,><!>This is a comment: embed anything here Perl|Python|Shell><,><,>1

EXECEND<,>echo if<,>if

Why did I create this simple language? I work in an environment that has different OSes and versions (Linux, Windows, and Solaris), and not all of them have the same components. To save myself from having to, say, install Perl on all of them and simply drop some code there (the ideal world), I created my own interpreted script to interface with the remote OS.

use Parallel::ForkManager

A very powerful and easy to use Perl multitasking module.

I now use it in every piece of code I write, as it’s annoying to keep updating code just to make it scalable. In my scripts I usually like to “fork” things off, especially when manipulating data.

With this module it’s as easy as a foreach loop:

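use strict;
use warnings;
use Parallel::ForkManager;
use PiroLabs::Utils::DataCruncher;   # my own module

my $max_procs = 10;            # maximum number of child processes at once
my @tasks     = qw(x / + -);   # one operation per child

# Create the manager once, outside the loop.
my $pm = Parallel::ForkManager->new($max_procs);

foreach my $task (@tasks) {
    # Parent: fork a child, then move on to the next task.
    $pm->start and next;

    # Child: run one operation over the data set, then exit.
    my $dc = PiroLabs::Utils::DataCruncher->new([0 .. 2000000], $task);
    $dc->Execute();
    $pm->finish;
}

# Block until every child has exited.
$pm->wait_all_children;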

In the snippet above I want to perform multiplication, division, addition and subtraction all at the same time on some defined data set.
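For anyone curious about what the module saves you from writing, here is a rough sketch of the same "fork up to N children, then wait" pattern using only Perl's built-in fork and waitpid. It is an illustration of the idea, not the module's actual implementation. The names reap_one, %running and %status are made up for this example, and the child's work is a placeholder.

```perl
use strict;
use warnings;

my $max_procs = 2;             # cap on concurrent children
my @tasks     = qw(x / + -);
my %running;                   # pid  => task currently running
my %status;                    # task => exit code

# Wait for any one child to exit and record its exit code.
# Returns true if a child was reaped, false if there was none.
sub reap_one {
    my $pid = waitpid(-1, 0);
    return 0 if $pid <= 0;
    my $task = delete $running{$pid};
    $status{$task} = $? >> 8 if defined $task;
    return 1;
}

for my $task (@tasks) {
    # At the cap: block until a slot frees up.
    while (keys(%running) >= $max_procs) {
        reap_one() or last;
    }

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: placeholder work, then exit with a status code.
        my $sum = 0;
        $sum += $_ for 1 .. 1000;
        exit 0;
    }

    # Parent: remember which task this child is handling.
    $running{$pid} = $task;
}

# Equivalent of waiting for all children.
while (%running) {
    reap_one() or last;
}

print "task '$_' exited with status $status{$_}\n" for sort keys %status;
```

The bookkeeping (tracking pids, blocking at the cap, reaping) is exactly the boilerplate I'd rather not repeat in every script.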

Next post I will show how to pass data from the child processes back to the parent.

Raspberry Pi: Static WiFi IP

I was impressed that the Pi worked with the WiFi gizmo out of the box once I set up the keys and SSID from the desktop. But really, DHCP is not ideal, especially if you only want to use the SSH server (I don’t care much about the mouse and fluffy desktop stuff).

Ran into a roadblock trying to enable a static IP: searched high and low but could not configure a static IP for the USB micro WiFi dongle (Belkin).

Finally got it working by editing the interfaces config (/etc/network/interfaces): I changed the default stanza from dhcp to static and added the IP properties:

auto lo
iface lo inet loopback

iface eth0 inet dhcp

allow-hotplug wlan0
iface wlan0 inet manual
    wpa-roam /etc/wpa_supplicant/wpa_supplicant.conf

iface default inet static
    address 192.168.2.120
    netmask 255.255.255.0
    gateway 192.168.2.1

Hope this helps someone in the same boat.