
Open Source visual web scraper – Portia

If you are looking for a scraper, i.e. a tool to crawl a website and extract data from it (price management, marketing, football score updates, news clipping, etc.), the Scrapinghub team has announced the release of Portia, an open source visual web scraper.

Portia lets you extract data in a very intuitive way, without any programming knowledge. You just annotate web pages in the visual editor to indicate what data to extract, and Portia learns how to scrape similar pages from the website.

Check out the demonstration video:

You can download Portia or contribute to the project through GitHub here.


Bandwidth restriction – How to limit the download speed for your visitors

It is a terrible thing when your personal server at home, your VPS, or even your cheap dedicated server does not have a very high upload rate, and one or a few users are saturating your bandwidth (by genuinely “consuming” your content such as photos, videos, or shared files, or even some script kiddies downloading one of the pictures from your blog in a loop).


The best thing is obviously... to have a good upload 🙂 But as you may not be able to choose that, the next best idea is to set a bandwidth limit.


The mod_bw module for Apache lets you set restrictions per VirtualHost (different websites or services), per IP, per file type and even per file size.



To install the mod_bw module, just type:

and enable it if not done automatically at the end of the installation
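On Debian or Ubuntu this usually comes down to two commands. Here is a minimal sketch, assuming the package is called libapache2-mod-bw and the module bw (names taken from the Debian packaging, so check your own distribution). The install_mod_bw function is only a hypothetical wrapper so that nothing runs until you call it.

```shell
# Sketch only: package and module names are assumed from Debian/Ubuntu packaging.
install_mod_bw() {
    # Install the bandwidth module for Apache 2, then
    # enable it in case the installer did not do so
    sudo apt-get install libapache2-mod-bw &&
        sudo a2enmod bw
}

# Nothing is executed until you call the function yourself:
# install_mod_bw
```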



You can directly write your configuration inside the vhost of your choice.

If, for example, you want to set a limit per visitor, per file type and per file size for your website “myblog”, but without any restriction on requests from your local network, edit the related virtualhost:

and add the following at the end, before the </VirtualHost>:
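As an illustration, here is a minimal sketch of such a fragment, written to a scratch file so you can review it before pasting it into the vhost. The directives are the ones documented by mod_bw, but the local network range and every speed value below are assumptions to adapt to your own setup.

```shell
# Write the fragment to a scratch file first so it can be reviewed
# before pasting it into the real vhost (all values are examples).
fragment="$(mktemp)"
cat > "$fragment" <<'EOF'
    # Turn mod_bw on and apply it to every request of this vhost
    BandwidthModule On
    ForceBandwidthModule On

    # No limit (0) for the local network; rules are matched in order,
    # so this line must come before "Bandwidth all"
    Bandwidth 192.168.1.0/24 0

    # Everybody else is limited to 100 KB/s (value in bytes per second)
    Bandwidth all 102400

    # Any file (*) larger than 1024 KB is served at 50 KB/s
    LargeFileLimit * 1024 51200
EOF
cat "$fragment"
```

Speeds are given in bytes per second and the LargeFileLimit size in KB, so it is worth double-checking the units before you reload Apache.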

Obviously you can set this up as you wish. I usually use only 2 levels: Bandwidth all and LargeFileLimit *.

Save your file (CTRL+X, then Yes) and restart Apache.

Linux Apache MySQL PHP (LAMP) Server to host our own website

The most popular configuration to host your own website is certainly the Linux + Apache + MySQL + PHP combination (also known as a LAMP server). Although it is possible to do the same using Windows (WAMP) or with other applications too, I will only focus on LAMP.

Apache is the most popular HTTP server. Coupled with MySQL, the world’s most popular open source database, and the very popular scripting language PHP, you should not be making a big mistake by using this combo.


To install it, just install all these applications:

During the installation, you will need to set the MySQL root password and select the web server to reconfigure (Apache in our case).



Either you build your own website from scratch, or you use a Content Management System (CMS) like WordPress, Drupal, or Joomla, which works like a template that you customize and fill with content (like this blog).

In any case, you will need to create a specific folder in /var/www, install whatever you want in it, and provide a VirtualHost to tell Apache which folder to deliver depending on the URL requested, as briefly discussed in my previous post about setting a static IP and how the server/router should respond.

Let’s say you want to host 2 websites, one blog and one photo gallery. Here is how to do it:

1) Create the folders

(You can call the folders as you wish)

2) Add whatever CMS, services you want inside

3) Give proper permission to these folders

As we created the 2 previous folders using the sudo command, their owner is now root, which will cause access issues. To avoid that, we need to change the owner of these 2 folders to www-data.

chown changes the owner. The -R option makes it recursive, so it applies to the whole folder and everything inside it, and www-data:www-data assigns the folders to both the user AND the group www-data.
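Steps 1 and 3 can be sketched as below. To keep it safe to try, the script defaults to a scratch directory and to your own user; WEB_ROOT and WEB_OWNER are hypothetical variable names introduced for this example, and on the real server you would set them to /var/www and www-data:www-data.

```shell
# Defaults are harmless: a scratch directory owned by the current user.
# On the server: WEB_ROOT=/var/www and WEB_OWNER=www-data:www-data (run as root).
WEB_ROOT="${WEB_ROOT:-$(mktemp -d)}"
WEB_OWNER="${WEB_OWNER:-$(id -u):$(id -g)}"

# 1) Create one folder per website
mkdir -p "$WEB_ROOT/myblog" "$WEB_ROOT/gallery"

# 3) Hand both folders, and everything inside them (-R), to the web server user and group
chown -R "$WEB_OWNER" "$WEB_ROOT/myblog" "$WEB_ROOT/gallery"

# Show the resulting owner and group
ls -ld "$WEB_ROOT/myblog" "$WEB_ROOT/gallery"
```

Saved as a script (say setup-sites.sh, a made-up name), it could be run for real with: sudo env WEB_ROOT=/var/www WEB_OWNER=www-data:www-data sh setup-sites.sh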

4) Create the needed virtualhost (Vhost) files

The Apache virtualhost files are the key to knowing which folder/service to deliver when a user requests a certain domain name.

For example, you may want your domain name to point to the folder /var/www/myblog and a sub-domain to point to /var/www/gallery, or maybe to use a second domain name directly. It is in fact not very difficult with the VirtualHost.

You will need to create the conf file in /etc/apache2/sites-available and then activate this virtualhost. (You could actually create the conf file directly in /etc/apache2/sites-enabled, so that it is active right away, but this solution is less flexible if you want to turn some websites on and off from time to time.)

– Create the virtualhost:

In my case, the VirtualHost that points to the folder /var/www/myblog looks like this (the comments after the ### can be removed if you want):
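A minimal sketch of such a file, generated in a scratch directory so you can inspect it before copying it to /etc/apache2/sites-available. The domain blog.example.com, the admin address and the log file names are hypothetical placeholders, not the values from my own server.

```shell
# Generate the vhost in a scratch directory; copy it to
# /etc/apache2/sites-available/myblog once you are happy with it.
scratch="$(mktemp -d)"
cat > "$scratch/myblog" <<'EOF'
<VirtualHost *:80>
    ### Domain name this vhost answers to (placeholder)
    ServerName blog.example.com
    ### Contact address of the administrator (placeholder)
    ServerAdmin webmaster@example.com
    ### Folder delivered for this domain
    DocumentRoot /var/www/myblog

    ### Where to write the logs for this website
    ErrorLog ${APACHE_LOG_DIR}/myblog_error.log
    CustomLog ${APACHE_LOG_DIR}/myblog_access.log combined
</VirtualHost>
EOF
cat "$scratch/myblog"
```

The ### comments sit on their own lines because Apache does not accept a comment at the end of a directive line.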

The remaining lines normally do not need many changes.

And you could do the same for your second website

– And now activate your 2 virtualhosts:

where myblog and gallery are the names of the virtualhost files created previously.

And finally restart Apache:
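Put together, the activation and the restart can be sketched like this, assuming a Debian-style Apache with the a2ensite helper. The enable_sites function is a hypothetical wrapper so that nothing runs until you call it, and the configtest step is my own addition.

```shell
# Sketch only: wrapped in a function so nothing runs until you call it.
enable_sites() {
    # Activate both vhosts (names of the files in sites-available),
    # check the configuration syntax, then restart Apache to apply the changes.
    # Each step only runs if the previous one succeeded.
    sudo a2ensite myblog gallery &&
        sudo apache2ctl configtest &&
        sudo service apache2 restart
}

# enable_sites
```

On Debian-style Apache 2.4 setups, the vhost files need a .conf suffix (myblog.conf, gallery.conf) to be picked up.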

If there is no error message, you should be all set.

PS: Don’t forget to create an A record at your domain name registrar pointing to your IP, and to allow incoming connections on port 80 in your router.

Exploits of a Mom

Host your own server – Where do we start?

So you wish to install your own server, maybe to run a website, your own mail, or a specific application or service (Subsonic? Minecraft?...).

You will obviously have a few requirements to meet, based on your needs.



You could have a dedicated server using OVH or any other provider, but I’ll assume you’re here to use your own hardware and host it at home.

In fact, a server does not need to be very powerful, so you could reuse an old laptop or computer if you want. For example, a Raspberry Pi (ARM-based, with 256 MB of RAM) is enough to host quite a few services. But don’t expect it to be very responsive, though.

My first dedicated server@HOME was a custom ITX (Small size) config based on:

Case: Thermaltake Element Q

Motherboard: Intel DG41MJ (ITX socket 775)

Processor: Intel E5300 2.5Ghz


Hard drive: 250 GB, 2.5 inch, 7200 rpm

Paid 250 € 4 years ago

I had a very good experience with it. I was hosting a few websites with modest traffic (a few hundred visits per day) and a dozen services such as Subsonic, Ajaxplorer, FTP, mail, etc. No need for anything much faster, in fact.

My current config, though, is way too powerful for my needs (but it was a gift to myself xD).

I now have an i7 2600 with 16 GB of RAM + a 64 GB OCZ Vertex 3 SSD + 2x2 TB of storage (for duplication). I actually really enjoy using SSDs in my machines now (fast load times, very good performance with MySQL databases or heavy I/O tasks).


Obviously, the faster your Internet connection, the better, but I would say there is no specific minimum: it will just limit the type of services you can run and the traffic you can handle.

If you have at least 128 kb/s (16 kB/s) of upload speed with your ADSL, that is a good start. Download usually does not matter much, as upload is always the bottleneck with ADSL. (If you have cable, ADSL2, VDSL2, or even FTTx, lucky you. In that case you will probably be very comfortable with upload.)

The server described earlier was on an ADSL2 connection at 16M/1M.

Now I have FTTB with 100M/40M (so much faster indeed).
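To get a feeling for what these upload speeds mean in practice, here is a small sketch of the arithmetic. It assumes decimal units (1M = 1000 kb/s, 100 MB = 100000 kB) and ignores protocol overhead; the function names and the 100 MB file are made up for the example.

```shell
# kb/s (kilobits per second) to kB/s (kilobytes per second): divide by 8
kbps_to_kBps() {
    echo $(( $1 / 8 ))
}

# Seconds needed to upload a file: $1 = size in kB, $2 = upload rate in kb/s
upload_seconds() {
    echo $(( $1 * 8 / $2 ))
}

# 128 kb/s of upload is 16 kB/s
kbps_to_kBps 128

# Time for one visitor to download a 100 MB file from the server,
# on the three upload rates mentioned above
for rate in 128 1000 40000; do
    echo "$rate kb/s: $(upload_seconds 100000 "$rate") s for a 100 MB file"
done
```

With these assumptions, the same file takes 6250 s at 128 kb/s, 800 s at 1M and 20 s at 40M, which is why upload is the figure to look at.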

Another important aspect of your network is your router, which has to route the needed traffic correctly to your server. You will need to open several ports to let the traffic in.

Operating System:

GNU/Linux is THE Operating System for servers. Widely used, very stable and with good performance, it is a good choice to run your server on.

In the GNU/Linux family, there are a lot of “flavors”, Ubuntu being the most popular and very easy to handle. Ubuntu has a dedicated server version called Ubuntu Server, which runs quite well. But although I started with Ubuntu Server, I quickly moved to Debian and can only strongly recommend that you give it a try.

Ubuntu being based on Debian, you will not feel much of a difference between the server versions. However, I found Debian to be much more stable and responsive than Ubuntu. Note that Debian has 3 major branches (Stable, Testing and Unstable) with different versions of the applications. Stable is based on a very robust and well-tested set of applications, Testing has more up-to-date versions, and Unstable has the cutting-edge ones, with possible bugs in these last 2 branches.

You want to play it safe? I suggest you use Debian Stable and, if an application is not recent enough, install a more up-to-date version from the backports repository.
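As a rough sketch of how backports are used: you add one extra line to your APT sources and then ask for the backports version explicitly. The mirror URL and the bookworm codename below are assumptions for the example, so use the mirror and the Stable release you actually run; backports_line is a hypothetical helper name.

```shell
# Print the APT source line for the backports of a given Debian release.
# The mirror URL is an assumption; reuse the one already in your sources.list.
backports_line() {
    echo "deb http://deb.debian.org/debian $1-backports main"
}

# Hypothetical release codename, replace it with the Stable you run
backports_line bookworm

# Then, as root, after adding that line to /etc/apt/sources.list:
#   apt-get update
#   apt-get -t bookworm-backports install <package>
```

Packages from backports are only installed when you ask for them with -t, so the rest of the system stays on Stable.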

How to redirect 1 domain name to another and correct URL bar

If like me, you want to redirect your old domain name to your new one while correcting the visitors’ URL bar, the solution is quite simple in fact using your VirtualHost.

Here is my example: I wanted to redirect my old domain name (with and without www) directly to my new one and make sure the domain name changes in the visitors’ URL bar. To do that, you need to tweak your vhost file a bit and add the domain name you want to redirect at the beginning of your existing VirtualHost.

Here is the interesting part of mine:
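A minimal sketch of what such a block can look like, written to a scratch file for review. The domain names are hypothetical placeholders, and the use of the Redirect directive with the permanent (HTTP 301) status is an assumption on my side.

```shell
# Generate the redirect block in a scratch directory for review;
# it goes at the beginning of the existing vhost file.
scratch="$(mktemp -d)"
cat > "$scratch/redirect-example" <<'EOF'
<VirtualHost *:80>
    ### Old domain name, with and without www (placeholder names)
    ServerName old-domain.example
    ServerAlias www.old-domain.example
    ### Send every path under / to the new domain ("permanent" = HTTP 301, assumed)
    Redirect permanent / http://www.new-domain.example/
</VirtualHost>
EOF
cat "$scratch/redirect-example"
```

Because the browser receives a real HTTP redirect, the address shown in the URL bar changes to the new domain.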



First, the server answers any request on port 80, the default HTTP port (*:80), for the ServerName or the Alias, and redirects any path requested under (/) to the new website.

That way, whichever of the old addresses you call, you are redirected to the new website.

Easy right?