Archive for the ‘Google / Internet / Programming’ Category

Opteron temperature on linux

Monday, April 30th, 2007

Opteron cpu temperature using GKRELLM

I’ve been looking around today for a way to read the temperature of my new Opteron processor. I’m using it as a webserver, and from what I learned on the internet it can be overclocked to 2.5GHz with no trouble; I read in other places that it can go up to 3GHz. The motherboard is one big factor in play, and so is the temperature of the CPU. As I understand it, the CPU should always be around 32 Celsius.

Today I found this amazing tool called gkrellm; I don’t know what the name stands for. It has a very nicely designed GUI that shows the load on every core as well as the CPU temperature. I was able to figure out that I’m running the CPU way below 32C: it was at 22C, i.e. I had 10 more degrees to go.

One of the things I also learned while researching overclocking is that a big factor is the load on the CPU. If someone is playing games, the CPU is heavily loaded. At around 80,000 page views a day the dual core Opteron did not pass 4% when I monitored it using top, so there is still a lot of room to go - although it seems like I really don’t need it :) .


Upgrading the webserver

Monday, April 30th, 2007

Last week I upgraded the machine running my webserver. Originally, I had an Athlon MP 1.1GHz with 700MBytes of RAM. I bought an Opteron 165 (1.8GHz) dual core, now running on an MSI socket 939 motherboard with 1GByte of RAM. The performance difference is dramatic. I learned a lot during this upgrade. The older processor was 32 bit, and the new one is 64 bit. Of course my first attempt was to install the Debian 64 bit release. I downloaded the network installation CD, burned it and tried to install it. Everything went well till the end of the installation, when the script said “Cannot find a suitable kernel”!

At that point, the machine had been down for at least three hours. I then decided to install Ubuntu, which is the closest operating system to Debian. I had had some problems with Ubuntu earlier, but at that point my choices were really limited. I had the 64-bit CDs of hoary, so I did a minimal installation and upgraded to feisty seamlessly and without a single problem.

I used apache2, fastcgi, php5 and mysql from the feisty distribution, and compiled postgresql myself. Soon I will be running all of them from my own builds, which usually gives the best results and makes it easier to keep up with the latest minor releases.


pdnsd versus dnsmasq on Debian for DNS caching

Tuesday, April 24th, 2007

My experience would most likely be similar on Ubuntu. I used pdnsd for DNS caching for a long time, almost 6 months. The speed gains one gets from DNS caching are tremendous, especially if you are running a server that periodically mirrors or copies content from a remote server.

My real estate website hosts almost 900,000 pictures for the listings displayed. Every listing has between 1 and 20 pictures, in three resolutions. Mirroring that number of pictures without a DNS cache requires a name server lookup outside my local network for every request, which adds up to almost three weeks to copy that many files. With a DNS cache things go much faster, and the same amount of data can be downloaded in approximately three days.

pdnsd is an awesome package. My trouble with it was very specific: memory usage. I need the memory for the database and for apache serving the requests, and I don’t want to waste it on pdnsd although I need pdnsd. It used to run with almost 30MBytes of memory, and I could not see why it needed all of this when I was limiting the maximum time to live for its cache and limiting the memory cache to 64KBytes.

After lots of trials with its config file, I decided to look for an alternative. dnsmasq was unbelievable to me. I didn’t configure a thing; I just installed it using apt-get. It caches the DNS requests with a response time comparable to pdnsd. Memory usage is almost non-existent. I use gmemusage to display the resource usage graphically, and it’s nowhere on the display.

mosama@debian:/home$ dig blogflux.com

; <<>> DiG 9.2.4 <<>> blogflux.com
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15760
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;blogflux.com.                  IN      A

;; ANSWER SECTION:
blogflux.com.           14400   IN      A       204.11.52.71

;; Query time: 61 msec
;; SERVER: 192.167.0.110#53(192.167.0.110)
;; WHEN: Tue Apr 24 20:43:32 2007
;; MSG SIZE  rcvd: 46

mosama@debian:/home$ dig blogflux.com

; <<>> DiG 9.2.4 <<>> blogflux.com
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9201
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;blogflux.com.                  IN      A

;; ANSWER SECTION:
blogflux.com.           14394   IN      A       204.11.52.71

;; Query time: 2 msec
;; SERVER: 192.167.0.110#53(192.167.0.110)
;; WHEN: Tue Apr 24 20:43:38 2007
;; MSG SIZE  rcvd: 46

The cache responds very quickly. The first dig to blogflux.com took 61 msec because it’s a cache miss. The following one took only 2 msec.
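To compare a cache miss with a cache hit without reading the whole dig output, the timing line can be pulled out with a small filter. This is only a sketch: the function name query_time_ms is made up for illustration, and it assumes dig prints its usual ";; Query time: N msec" line as in the transcripts above.

```shell
# Print the number from dig's ";; Query time: N msec" line, read from stdin.
# query_time_ms is a hypothetical helper name, not part of dig or dnsmasq.
query_time_ms() {
    awk '/^;; Query time:/ { print $4 }'
}

# Usage (needs network access and a resolver, so it is left commented out):
#   dig blogflux.com | query_time_ms    # first lookup: cache miss
#   dig blogflux.com | query_time_ms    # repeat lookup: should come from the cache
```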

gmemusage after installing dnsmasq


Reduce MYSQL memory usage

Thursday, April 19th, 2007

I’m using Postgresql as the main database for my website. I reached this decision after a lot of problems with MySQL. That however doesn’t mean that MySQL is totally bad; it’s just not the right product for my needs. People who are not very savvy, or who do not intend to research or read a lot in the database docs, should use MySQL first. It is much simpler to install and administer.

Although I switched to postgres for good, some services on my website are still running on MySQL, namely twatch for stats, and Wordpress blog. I will be moving Wordpress to postgres in the near future, however, I will still need to continue using twatch for a while until I implement something close that works with my website on postgres.

The problem I have been having lately is huge memory consumption with MySQL. It used to eat around 100-150MBytes of my server’s memory. I’m running this website on an Athlon MP, 1.1GHz, 700MBytes RAM machine. MySQL was basically choking postgres, and I needed that memory to enlarge postgres’s cache.

I played with the my.cnf configuration file a lot and nothing seemed to resolve this problem. Yesterday, I noticed the line skip-innodb somewhere in the file, with a comment beside it: “You might want to disable InnoDB to shrink the mysqld process by circa 100MB”. That line did the trick. MySQL is now using around 10 megs of memory.

I checked both twatch and Wordpress. Both of them seem to work fine after this change, and they are not slower.
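A quick way to confirm that the option is really active, and not still commented out, is to grep the config file for it. A minimal sketch, assuming a my.cnf-style file where comments start with #; the helper name has_skip_innodb and the path in the usage line are assumptions, not something taken from the MySQL docs.

```shell
# Succeed if FILE contains an active (uncommented) skip-innodb line.
# has_skip_innodb is a hypothetical helper name.
has_skip_innodb() {
    grep -Eq '^[[:space:]]*skip-innodb[[:space:]]*$' "$1"
}

# Usage (the path is an assumption; it differs between distributions):
#   has_skip_innodb /etc/mysql/my.cnf && echo "InnoDB is disabled"
```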


postgresql 8.2.3 rocks

Friday, April 6th, 2007

I compiled postgres 8.2.3 with full optimization (-O3), and it’s superfast. A couple of things that I noticed:

  1. If you configure without openssl, tests fail. I got everything working fine when I configured --with-openssl.
  2. When I was moving my data from 8.1, I first used pg_dump to dump the whole database. That didn’t work because the tsearch2 config tables were overwritten. So I had to dump one table at a time. That didn’t work either, because the new psql for some reason does not understand \N as nulls. The only way I got it to work was by using -D -c with pg_dump, which means use full inserts and drop tables before attempting to create them.
  3. Last, after I finished moving the data postgres was severely slow. I tried for more than 2 hours to understand what’s wrong with it; I even recompiled a couple of times! In the end, I found that I needed to do a vacuum full analyze after inserting that much data.

I noticed that it is now using much less CPU than 8.1, and after I re-indexed the full text search with GIN instead of GIST I got a super performance boost. Queries that used to run in 1.5 minutes now take approx. 1 or 2 secs.
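The dump, reload and vacuum steps could be strung together roughly as follows. This is a sketch under assumptions rather than a record of the exact commands used: it assumes the old 8.1 server and the new 8.2.3 server listen on different ports of the same machine, the function names run and migrate_db are made up, and -D is the 8.2-era pg_dump spelling (newer releases spell it --column-inserts). With DRY_RUN=1 it only prints the commands.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# migrate_db DBNAME DUMPFILE OLD_PORT NEW_PORT   (hypothetical helper)
# Step 1: dump with full INSERT statements (-D) and DROP statements (-c).
# Step 2: replay the dump into the new server.
# Step 3: VACUUM FULL ANALYZE so the planner has fresh statistics after the bulk load.
migrate_db() {
    run pg_dump -p "$3" -D -c -f "$2" "$1" &&
    run psql -p "$4" -d "$1" -f "$2" &&
    run psql -p "$4" -d "$1" -c 'VACUUM FULL ANALYZE'
}

# Usage: DRY_RUN=1 migrate_db mydb /tmp/mydb.sql 5432 5433
```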


Trace watch (twatch) web stats, MySql and Postgresql

Wednesday, March 14th, 2007

Every website owner tracks his stats and analyzes them. Various tools exist to do that. I personally use google analytics, trace watch from tracewatch.com, and two tools provided by my web hosting company, namely awstats and webalizer. Each one shows me a different set of stats; however, the two that together give a complete picture are google analytics and twatch.

Google Analytics is excellent. Some people don’t use it because they don’t want google to see what’s happening on their websites. It has one deficiency: it does not count robot hits. Its advantage is tracking users with cookies, so visitors returning to the website after two days are tracked and counted as returning visitors. Trace watch counts robot hits, but on the other hand does not count distinct unique users using cookies.

Trace watch has another set of problems when it comes to scalability, and more than once it broke my website. Its problem is mainly due to using MySql, which also does not scale well (see comparison on intel xeon by tweakers). The problem usually happens as tables get filled.

Robots daily hits sample

This is a screenshot sample of my daily robot hits. I receive around 70K hits a day from robots, in addition to those from users. Usually after about 3 months, MySql 5.0 starts timing out queries from twatch, and sometimes does not accept connections because queries arrive faster than it can respond, so it eventually gets flooded. As we all know, MySql is fast only if there are no simultaneous queries, and if you keep your tables and indexes at a reasonable size.

Because I pool connections the php cgi just returns with an error, so all pages on my website stop working! What happens is that the mysql php extension tries to reuse a connection it thinks is open, but mysql rejects it from the other side - so the php cgi does not start.

I believe that tracewatch can be better, and for that it needs:
- Use postgres instead of MySql. None of my experiences with MySql at large table sizes were successful.
- Use cookies to track visitors. PHP provides sessions, which can easily be used for tracking.
- Stop obfuscating the code, and release it as an open source application.

I have searched many times for a web stats program that uses both postgres and PHP, even one that is just starting out so we can work together to make it useful, since I believe we really need one, especially at the higher end and for websites that anticipate larger growth.


Vista sales tank because of piracy

Saturday, March 10th, 2007

An article described Ballmer’s opinion that Vista’s sales tanked because of piracy. The same article also describes Microsoft’s plan to increase sales in emerging markets by tightening Windows licensing.

I believe that Microsoft WILL SLIGHTLY increase sales in the US, Canada and Europe by tightening security, however in the emerging markets IMO the price of windows is outrageously high compared to what people make. Consequently businesses are going to buy but not the end user, unless Microsoft OEMs windows for pennies, which won’t add any revenue.

Ubuntu picture

Let’s look from another perspective and see what people really want from the OEM industry, for example on Dell’s ideastorm. The most wanted feature is to OEM laptops and desktops with linux distributions. What’s amazing is that Dell is listening.

To reiterate - will tightening Windows licensing increase Microsoft’s sales? Wrongo; it’s listening to people that will increase Microsoft’s sales! I really hope that HP and Toshiba as well as other major laptop distributors take the same route as Dell and start providing laptops and desktops with open source software, including linux options as well as openoffice.org. And I really hope that Microsoft will interact with users, listen to requests, stop spreading their efforts thin chasing every other product in every industry, and focus on the OS to produce a good quality operating system that meets people’s requests.


Open house management software

Tuesday, March 6th, 2007

Did you ever conduct an open house and have people sign in on a sheet of paper? Have you ever tried to read what your visitors wrote… and of course been amazed by their unreadable handwriting!

Well, I made a nice, easy and free solution to this problem: a small program that runs on openoffice base. The whole idea is to run that program on your laptop instead of using the old way of paper and pen.

Download

Open house manager

Simply download openoffice from here, then download my program Open house manager from here. Unzip the file, and double click on it. There is only one form; click on it and it should display a picture like the thumbnail. You register the description of the open house in the upper sheet, while the visitors register their names in the lower sheet. That way you get to keep a readable record of who came to your open house. All ideas, comments and requests are welcome.

Since I am affiliated with Exit Realty Plus, you’ll find Exit Realty Plus logo in the form, however since the program is free as defined in the GPL license, then you can add your own logo in your version.

License

This program is released in the hope that it benefits others; I am not giving any guarantees or warranties of any type. The program is released under the GNU General Public License, AKA GPL, version 2 or at your discretion a later version. If you plan to distribute this program or change it, please read the license and understand it carefully.


Wordpress security problem

Saturday, March 3rd, 2007

Thanks wordpress team for fast advertising.

I found this message today on my Wordpress Dashboard: “WordPress 2.1.1 dangerous, Upgrade to 2.1.2”. It looks like there’s a code insertion bug, as I also saw a post about it on webmasterworld.

I upgraded, and it was straightforward. Since I rely on Wordpress heavily, I will bookmark their RSS. I think that’s the fastest way to get notified of security problems!

I have a script that runs daily and backs up my sql. So I unpacked the files, checked out the website from my cvs, overwrote wordpress, and updated my online version.


Windows Vista, Linux

Tuesday, February 20th, 2007

Microsoft had to come up with a new piece of software after all the work they did over the last five years. The main idea behind vista is tightened security, digital media rights and a better look.

I don’t think that Vista will be the turning point for microsoft’s downtrend. It is rather a pull of the ear to make them wake up, and to make the rest of the Microsoft-only resellers like Dell wake up too.

Microsoft will not be selling new Vistas, but rather delivering them as OEMs with new computers. I believe that resellers will continue to provide XP for those who don’t want Vista. I hope that HP, Dell, Toshiba and other laptop and desktop resellers include machines with Linux and MacOS.

I believe that if Microsoft had spent those five years enhancing their terminal “cmd.exe” and the terminal tools to offer a more comprehensive *nix-like solution, that would have been much better.

Did anyone ever face the problem of removing a number from the names of 4000 files? The only way to do it on windows is by writing a VB program. There is no tool natively on windows that enables one to do it - unless you install cygwin. Having a way to do things is important. Not having any way to do it at all is bad!
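For comparison, here is roughly what that job looks like in a unix shell (or under cygwin). A minimal sketch, assuming that “removing a number” means deleting every digit from each file name in one directory; the function name is made up, and note that it would also strip digits from extensions such as .mp3. Files whose new name would be empty or would clash with an existing file are left alone.

```shell
# Remove every digit from the names of the regular files in DIR.
# strip_digits_from_names is a hypothetical helper name.
strip_digits_from_names() {
    dir=$1
    for path in "$dir"/*; do
        if [ ! -f "$path" ]; then continue; fi
        name=${path##*/}
        new=$(printf '%s' "$name" | tr -d '0-9')
        # Nothing to do when the name has no digits or would become empty.
        if [ "$new" = "$name" ] || [ -z "$new" ]; then continue; fi
        # Never overwrite an existing file.
        if [ -e "$dir/$new" ]; then continue; fi
        mv -- "$path" "$dir/$new"
    done
}

# Usage: strip_digits_from_names /path/to/pictures
```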


How to make your website visitors stay longer

Tuesday, February 13th, 2007

This is a very frustrating topic. One can keep trying to learn what makes his website visitors happy for a very long time. Obviously, modifying your website to suit your visitors is a topic specific to the content you present. However, there are some general considerations. In this post, I will keep updating my experiences plus add some links to relevant or similar content.

My first experience is:

fonts

Lately, I switched the website fonts from Arial to Arial rounded bold. I lost 20% of the pageviews per visitor! Amazing effect. I switched again, to the same font used on moodle.org, and I’ll see what happens in the next few days.

My website traffic went back up by 20%. Visitors are now staying 20% longer than before on average. On average, I now get 5 page views per visitor.

Speed of page

Here are two specific experiences with speed.

  1. Everyone talks about moving from tables to CSS, and says that it will increase your page download speed. I agree with that in general. Specifically, though, if your CSS is a big file with lots of sophisticated features, it will obviously take your web browser more time to render the page. So keep your CSS as simple as possible - this sped up the page SHOW time a lot for me.
  2. Use compression with PHP. Most of the websites out there are using PHP hosting. You can include a local php.ini file. In that file make sure you have the following lines:

    zlib.output_compression = On;
    zlib.output_compression_level = 1;

    then look at the resulting phpinfo by writing a simple page, <?php phpinfo(); ?>, and make sure you see those options on. The compression level is anywhere from 0 to 9. In my experience, and for my website, level 1 compressed pages by 30% and did not overload the CPU. At levels above 1, the php processor appeared in top eating 4% or more for every page request, and added very little extra compression.

Doing those updates increased my page views to 7.8 per visitor (appears as 8 on twatch).
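The trade-off between compression level and output size can be tried from a shell before touching php.ini, since gzip offers a comparable 1 to 9 scale. A small sketch, assuming a saved copy of one of your pages; gzip_size is a made-up helper name and it only measures size, not the CPU cost inside PHP.

```shell
# Print the gzip-compressed size in bytes of FILE at compression LEVEL (1-9).
# gzip_size is a hypothetical helper; it does not touch PHP at all.
gzip_size() {
    gzip -"$2" -c "$1" | wc -c | tr -d ' '
}

# Usage: compare a few levels on a saved page, e.g.
#   for level in 1 6 9; do echo "$level: $(gzip_size page.html "$level")"; done
```

If the higher levels only shave off a few more bytes, that would be consistent with staying at level 1.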

Still under construction: PHP caching. Caching is excellent since it significantly reduces the amount of time used by the database queries. I tried a very simple way, by md5 hashing the incoming URLs and saving the contents in files named by the md5 hash. The result is that I discovered how quickly the md5 hash can generate collisions!!
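The idea of naming cache files by the md5 of the URL can be sketched in a few lines of shell. This is an illustration, not the PHP code behind the site: the function names and the CACHE_DIR default are assumptions, and it relies on the md5sum tool from GNU coreutils. As one possible guard against two URLs ending up on the same file, it stores the URL on the first line of each cache file and checks it on the way out.

```shell
# Directory for cached pages (an assumed default; override as needed).
CACHE_DIR=${CACHE_DIR:-/tmp/page-cache}

# Print the md5 hex digest of a URL.
cache_key() {
    printf '%s' "$1" | md5sum | cut -d' ' -f1
}

# cache_put URL   (page content is read from stdin)
cache_put() {
    mkdir -p "$CACHE_DIR"
    { printf '%s\n' "$1"; cat; } > "$CACHE_DIR/$(cache_key "$1")"
}

# cache_get URL   prints the cached content; fails on a miss or a URL mismatch
cache_get() {
    file="$CACHE_DIR/$(cache_key "$1")"
    [ -f "$file" ] || return 1
    [ "$(head -n 1 "$file")" = "$1" ] || return 1
    tail -n +2 "$file"
}
```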

Website bugs

I still have PLENTY of website bugs. I know them, and I’m keeping track of them. Those kinds of bugs are not security bugs, but rather bugs that can cause user frustration because of unexpected behavior. Cleaning up those bugs should further reduce visitor frustration, and that’s my next goal.

Links:
http://www.web-money-advices.com/index.php/how-to-make-your-visitors-stay/2007/01/13/


Excellent CSS website

Monday, February 12th, 2007

This webpage has excellent CSS examples. The author basically covered almost all tricks one can do with CSS including pure pop-ups.

I think I will use some of those pure pop-up techniques to make the filters more tolerable. Currently the way filters are displayed on pages makes them hard to use - and sometimes very hard to notice.


Moving large number of files over SSH

Monday, February 12th, 2007

Moving a large number of files across the network is a very time consuming process. I tried to copy all my PhD work from Miami to Maryland. The total size compressed is 2.1GB. It uncompresses to a huge number of files: numerous source files, lots of small images, many pdf and PS papers.

I tried moving the directory tree with fish over KDE. The last time I did that it was between two computers in the multimedia lab and it took two and a half days. Moving that same structure over the internet would obviously take weeks using KDE. Compressing the files on the hard drive and then moving them was out of the question, since the hard drive over there did not have the space.

After research, I found that one can pipe the output of tar over ssh, so basically tar -czf - ./ | ssh home tar -xzvf - would do the job.
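In that pipeline the first tar writes a compressed archive of the current directory to standard output, and ssh feeds it to a second tar that unpacks it on the other machine, so no intermediate archive file is ever stored on disk. A sketch of the same idea wrapped in a function; copy_tree is a made-up name, the host argument is optional so the function can be tried locally first, and the remote path must not contain single quotes.

```shell
# copy_tree SRC_DIR DEST_DIR [HOST]
# Stream SRC_DIR as a gzip-compressed tar archive and unpack it into DEST_DIR,
# either on this machine or, when HOST is given, on HOST over ssh.
copy_tree() {
    if [ -n "${3:-}" ]; then
        tar -C "$1" -czf - . | ssh "$3" "tar -C '$2' -xzf -"
    else
        tar -C "$1" -czf - . | tar -C "$2" -xzf -
    fi
}

# Usage (host name "home" as in the command above; the directories are assumptions):
#   copy_tree ~/phd /data/phd home
```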


Recursively adding files into cvs (Recursive cvs add)

Sunday, February 11th, 2007

Adding files to cvs can be cumbersome, especially if they are in a directory tree. Although the cvs tool provides a lot of functionality, recursive addition is not there.

There are a couple of ways to achieve this, only on linux or unix environments, using either the xargs or awk utilities.

Using XARGS: find . | xargs cvs add
The only problem is that all the files found are concatenated into one long string and passed after cvs add. After several trials, I found that this does not always work. An example is when one of the files has a space in its name.

Using awk: find . | awk '{print "cvs add \""$0"\""}' | bash
This one simply writes every line using awk, and passes them to a shell to be executed. This one always worked for me correctly.
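Another way around the spaces problem is to have find separate the names with NUL bytes and let xargs split on those. A sketch, assuming a find and xargs that support -print0 and -0 (the GNU and BSD versions do); recursive_add is a made-up name. It skips the CVS bookkeeping directories and runs the command once per path, in the order find reports them, so a directory is added before the files inside it.

```shell
# recursive_add DIR [COMMAND...]
# Run COMMAND (default: cvs add) once for every path under DIR,
# skipping CVS administrative directories and coping with spaces in names.
recursive_add() {
    dir=$1
    shift
    [ "$#" -gt 0 ] || set -- cvs add
    ( cd "$dir" && find . -mindepth 1 -name CVS -prune -o -print0 | xargs -0 -n 1 "$@" )
}

# Usage: recursive_add .          # real run, inside a checked-out working copy
#        recursive_add . echo     # dry run: only print the paths
```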

I always use CVS over SSH. One wonderful addition that can make your life much easier is to use key authentication instead of password authentication. That way cvs will not prompt you for the password for every action it attempts. Use ssh-keygen to generate a pair of keys on your local machine. Append the contents of id_rsa.pub to ~/.ssh/authorized_keys in your home directory on the server.


Website updates complete

Saturday, February 10th, 2007

Completed updating all pages to work off graphics from the subdomain. Switched to wordpress 2.1. Learned a lesson when the hard drive storing my database crashed, so I bought two SATA hard drives: one stores the database and media, and the other gets mounted once a day, all the databases (pgsql, mysql) get dumped and gzipped there along with all my cvs, and then it gets unmounted.
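The dump-and-gzip part of such a daily job could look like the sketch below. Everything named here is an assumption for illustration: the helper dump_gz, the device and mount point in the usage comment, and the choice of pg_dumpall and mysqldump as the dump commands.

```shell
# dump_gz DEST_DIR NAME COMMAND [ARGS...]
# Run COMMAND and store its gzip-compressed output as DEST_DIR/NAME-YYYYMMDD.gz.
# Note that the exit status is gzip's, not the dump command's.
dump_gz() {
    dest=$1
    name=$2
    shift 2
    "$@" | gzip > "$dest/$name-$(date +%Y%m%d).gz"
}

# A nightly run might then look like this (device, mount point and paths are assumptions):
#   mount /dev/sdb1 /mnt/backup
#   dump_gz /mnt/backup pgsql pg_dumpall
#   dump_gz /mnt/backup mysql mysqldump --all-databases
#   dump_gz /mnt/backup cvs tar -C /cvs -cf - .
#   umount /mnt/backup
```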


Website updates

Sunday, January 28th, 2007

Last week I moved the nameservers from bluehost to everydns. Shared hosting is a very good idea when you’re starting your website, and Bluehost is actually one of the best hosting companies. However, as the website traffic grew I had to move my postgres database out of there and host it at home. Currently, I’m getting close to 5600 visits a day according to webalizer, and with the way my website is set up this generates 70K page views a day on average. Since every page requires two or three database queries to show, plus some graphics downloaded from the MLS to show the listing, things are becoming really slow.

Now I have changed the way the website pages work. Graphics are fetched up front and saved locally. I created another subdomain, media.mibrahim.net, which I use only for graphics, and I host it in my basement over verizon fios. So the pages hosted on bluehost will no longer fetch the graphics and should now load faster.


cvs over ssh

Friday, January 26th, 2007

I always used CVS over pserver. Lately, I faced a problem with UBUNTU edgy. The cvsd wouldn’t execute checkouts. It only executed updates, commits and diffs. Basically it did everything except checkouts.

I tried cvs over ssh, and I actually found that it is much better than cvs over pserver. Although it is slower, I don’t have to run a special service for cvs, and the way cvsd operates is by actually jailing cvs pserver for security reasons. So it made more sense to me to run it over ssh.

To do that, one needs to set the variable CVS_RSH to ssh, and CVSROOT to :ext:UserName@host:/cvs/directory . Whether or not you can read the repository or commit to it is determined by your account privileges. In the old pserver way, the whole repository belonged to the user cvsd and the group cvsd.
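In a shell this comes down to two exports, which can live in a profile file or in a small function like the one below. A sketch only: use_cvs_over_ssh is a made-up name, and the user, host and repository path in the usage line are placeholders.

```shell
# use_cvs_over_ssh USER HOST REPOSITORY_PATH
# Point cvs at a repository reached through ssh instead of pserver.
use_cvs_over_ssh() {
    export CVS_RSH=ssh
    export CVSROOT=":ext:$1@$2:$3"
}

# Usage (placeholder values):
#   use_cvs_over_ssh UserName host /cvs/directory
#   cvs checkout mymodule
```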


Launching H & H website

Wednesday, December 27th, 2006

Today I hosted the website for H&H in my basement computer running Debian + postgresql. The machine has been running for a while successfully. The trick behind that website is using the redirection provided by everydns.net . If you’re using Verizon business DSL or fios and have port 80 blocked consider everydns.net redirection.

The H&H machine should be converted to UBUNTU soon. I’ve been playing with UBUNTU and I like the fact that I have a pretty wide library of binaries, exactly as I have in debian, but with almost all new releases. My problems with Debian began when I tried to install PHP5 on the stable release. I ended up messing up the installation because PHP5 is only available in testing, and even the parts available in the testing branch did not cover all of the PHP5 basic libraries! For example, I didn’t find the php cli package. When installed, and given that it’s testing, it did not correctly modify the config files in /etc and consequently didn’t run. I found myself doing a lot of work that is already done for me if I switch to UBUNTU, and that’s why I will pick one calm night and perform the upgrade.


PostgreSQL and the CPU hunt

Wednesday, August 2nd, 2006

It’s a continuous, non-stop problem: How to reduce the CPU usage of queries?

I did so much work on optimizing the tables, creating indices, showing postgres that queries are REALLY of the same nature as the indices I created, and convincing it to use a certain index. However, postgres is different from mysql. In MySQL one can write an sql select statement and then force the DBMS to use a certain index in the query plan. In Postgres this is not possible. If the DBMS is not convinced that your query matches the index you created, it will do a sequential scan - even if you disabled sequential scans.

What I found is that if you have a table like
TableName( col1 bigint, col2 character varying(32))
and an index idx(col1), then run a statement like
select * from TableName where col1=1
the EXPLAIN command always says that it will use a sequential scan. The catch is that the DBMS understands the 1 in the query as an integer and not a bigint, and so the index is incompatible with the query!!! - I’m using 7.4

The solution that I found for this is
select * from TableName where col1=1::bigint.

The second biggest optimization that I did was for queries like
select code from tab1 where key is null.
Whether or not there is an index on key, the planner uses sequential scans. Again, the solution that I found is to use a code that stands for null. So for example if key is an integer, and we know that it is always greater than 0, use -1 whenever you mean null. That way the query will be
select code from tab1 where key=-1::bigint
if key is bigint. Only then does the query planner use index searches.

One of the very best ways that I also found today is to create a log file with the queries that you execute. The first step in optimization is to have real queries. The second important step is to use explain on real data. I created a function called query in my php scripts. Based on a global flag, this function writes every query to a log file before executing it.
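The same log-before-execute idea can be tried from the command line. What follows is a shell analogue, not the PHP function from my scripts: the variable names QUERY_LOG and QUERY_RUNNER are made up, and psql -X -q -c is just an assumed way of running one statement.

```shell
# Set QUERY_LOG to a file path to turn logging on; leave it empty to turn it off.
QUERY_LOG=${QUERY_LOG:-}
# Command that receives the SQL text as its last argument (an assumption).
QUERY_RUNNER=${QUERY_RUNNER:-psql -X -q -c}

# query SQL
# Append a timestamped copy of the SQL to the log when logging is on, then run it.
query() {
    if [ -n "$QUERY_LOG" ]; then
        printf '%s\t%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" >> "$QUERY_LOG"
    fi
    # Unquoted on purpose so the runner string splits into command and options.
    $QUERY_RUNNER "$1"
}

# Usage: QUERY_LOG=/tmp/queries.log; query 'select code from tab1 where key=-1::bigint'
```

The logged statements can later be pasted after EXPLAIN to see which ones fall back to sequential scans.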

One of the things that I noticed but have not optimized yet is queries that get executed more than once in the same session, for example looking up a certain code in a listing. The way I imagine avoiding the double call to the DBMS is to cache the results at some stage, and to check the cache before executing an expensive SQL query.

I’ll be focusing on the interface now. I’m planning lots of changes that would make the site more viewable, especially on 800×600 screens, since I found that I’m getting lots of hits from people who use them. Basically, close to 70% of people using my website have screen resolutions of 1280×1024 or less: 53% 1024×768, 10% 800×600 and 13% 1280×1024. The planned changes include a lot of CSS coding to remove a huge part of my HTML theming for the website.

Website updates

Wednesday, August 2nd, 2006

If you use the website now, you will be able to tell that it is way faster. All queries have been optimized, the database optimized, all PHP code optimized!

One important thing that I noticed is that all my queries used upper(column)='value', while all my indexes were straight on the columns, so none of the queries utilized the indexes and they used straight table scans instead.

After adjusting this, the cpu load from postgres is almost zero all the time, with spikes every now and then.
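There are two ways to line the queries up with the indexes: compare against the plain column, or index the expression the queries actually use. The sketch below shows the second option and is not necessarily the adjustment made on this site; upper_index_sql is a made-up helper that only prints the statement, and the table, column and database names in the usage line are placeholders.

```shell
# upper_index_sql TABLE COLUMN
# Print a CREATE INDEX statement for an index on upper(COLUMN),
# so that conditions like upper(COLUMN)='VALUE' can use it.
upper_index_sql() {
    printf 'CREATE INDEX %s_%s_upper_idx ON %s (upper(%s));\n' "$1" "$2" "$1" "$2"
}

# Usage (placeholder names): upper_index_sql listings city | psql mydb
```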

Lots of new regions added. The latest region list is:

M A R Y L A N D

ANNE ARUNDEL-MD

BALTIMORE-MD

BALTIMORE CITY-MD

CARROLL-MD

FREDERICK-MD

MONTGOMERY-MD

HARFORD-MD

HOWARD-MD

PRINCE GEORGES-MD

QUEEN ANNES-MD

V I R G I N I A

ALEXANDRIA CITY-VA

ARLINGTON-VA

FAIRFAX-VA

FAIRFAX CITY-VA

FAUQUIER-VA

FREDERICKSBURG CITY-VA

LOUDOUN-VA

PRINCE WILLIAM-VA

STAFFORD-VA
