2010-12-14

A review of digiKam

 

 

After an epic (but failed) struggle to get Canon EOS Utilities to work on XP on VirtualBox, I started looking around for a suitable alternative on Ubuntu. A quick Google search took me to a fantastic page on photography-on-the-net.

The page lists a number of options, such as:

  • digiKam
  • Shotwell
  • F-Spot
  • Google Picasa
  • gThumb

Since I have already tried F-Spot (which is incidentally the default graphics editor on Ubuntu) and hate it, I decided to try the other alternatives. In this post, I will discuss my experiences with digiKam.

First up, the page on photography-on-the-net dampened my enthusiasm, since it lists digiKam as having only 16 bit / channel support for color depth. Looking past that, I decided to go to the digiKam homepage. Although there's a link for "Download", I found a handy "blog" entry referenced on the homepage titled "Install digiKam 1.6 on Ubuntu 10.10". Perfect!

However, the very first response on that blog entry is a warning - apparently, the repository for digiKam contains not just digiKam but other software that a user may not want to upgrade. Yikes! The very first thing then was to check the packages in the PPA. Again there was a handy link for the PPA description. A quick glance revealed nothing that would seriously affect my system, so I decided to take the plunge!

Installation is straightforward:

sudo apt-add-repository ppa:philip5/extra
sudo apt-get update
sudo apt-get install digikam

After the installation, I fired up digiKam and was pleased to find a decent enough interface. However, it is very irritating that it first wants me to create an Album. Once an album is created all pictures that I subsequently select for download get added to that Album. I couldn't find an option for it to create the album automatically (for example, by date). All pictures in an album remain in the same folder. This isn't too pleasing a functionality for photogs like me, who may have a lot of content (taken over days) on their camera.

I was also somewhat disappointed that it didn't automatically detect when a camera was attached to the PC, and the navigation menu is slightly confusing - the very first menu listing is for "Albums", while I expected it to be "Image" (or something to do with Images).

However, when a camera is connected, it does show it under the "Import" Menu:

It did connect to my Canon EOS 5D Mark II and I could see the pictures:

Irritatingly, it shows the last photo first (which can be bothersome if you downloaded some pictures off your camera earlier, and subsequently want to download the rest/newer pictures). But that's easily fixed by toggling the "Show last photo first" option in the View menu:

At this point I find the UI begins to show its limitations. You cannot right click in the list of pictures to make a selection. There isn't a checkbox (like the one provided in Canon ZoomBrowser) to select photos. You have to manually select (Ctrl+Click) the pictures you want to download. There are however menu items to select all, etc.

Select the pictures you want to download and the transfer process begins. Note that it doesn't automatically track the pictures you have already downloaded, so you can end up with duplicates (though it does have a function to help you get rid of duplicates!).

Another interesting feature in this menu is "Capture":

However, I couldn't figure out how that capture truly works. Sometimes it takes one picture, sometimes two. Additionally, as the picture above shows, the captured picture doesn't seem to be right, as visible in the thumbnail with caption "capt00000.rc2".

Once your pictures are downloaded, you get the "Album" view, with the list of albums on the left pane, with the pictures in each album on the top pane and a larger view of a selected picture in the lower pane:

However, this is where the GUI gets quirky. As soon as you click on the image (which I have to admit, I end up doing quite a bit - perhaps because I'm so used to it), the view changes to the thumbnail view:

Additionally, even in the picture preview mode, I find the navigation quite difficult. There is a tiny arrow to navigate back and forth between your pictures (which is quite easy to miss, and missing it ends up taking you to the thumbnail view). There is an option to view your pictures as a Slideshow (either all pictures in the album, or a selection). However, when I selected a few pictures to view as a Slideshow, it only showed the first picture in the selection!

There is a cool feature to find similar pictures based on a fuzzy search with an acceptable level of threshold. This fuzzy search is similar to one by ImgSeek and is based on the Fast Multiresolution Image Querying paper. When you select the option to find similar the first time, it asks to build an index. Once the index is completed, you can then select/modify the acceptable threshold and an image for which you want to find similar images:

In the screen capture above, there are 2 matches with 70% threshold. However, if the threshold drops down to 50%, it picks up ALL the images similar to this (even the one that was out of focus!):

Pretty cool! I suspect this feature will be used quite a lot! Another way of searching for similar pictures is by drawing a "sketch" for a fuzzy search. This search seems to be a search based on the pen size and colour (which probably determines the area over which the chosen colour is spread in all the images in the selected album):

Lastly, you can search for duplicate images within an acceptable level of threshold - a feature I find quite handy, especially since I end up shooting lots of pictures of the same subject, with slightly different angles and settings:
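To make the effect of the threshold concrete, here is a minimal sketch of how a similarity cut-off filters results. It is not digiKam's actual implementation: the function name, the file names and the scores are all hypothetical, and it simply assumes each image has already been given a similarity score between 0 and 1 against the chosen reference picture.

```python
def similar_images(scores, threshold):
    """Return the image names whose score meets the threshold, best match first."""
    hits = [(name, score) for name, score in scores.items() if score >= threshold]
    hits.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in hits]


# Hypothetical similarity scores against one reference image (not real digiKam output).
scores = {
    "IMG_0001.JPG": 0.91,
    "IMG_0002.JPG": 0.74,
    "IMG_0003.JPG": 0.58,  # say, a slightly out-of-focus shot of the same subject
    "IMG_0004.JPG": 0.52,
    "IMG_0005.JPG": 0.12,  # an unrelated picture
}

print(similar_images(scores, 0.70))  # strict threshold: fewer matches
print(similar_images(scores, 0.50))  # looser threshold: more matches
```

Lowering the threshold can only ever add matches, which is why the out-of-focus shot shows up once the bar drops.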

Going back to the Album view, there is a pretty decent context menu that allows you to edit the current image (either within digiKam itself, or in a host of external programs):

There are some pretty decent editing capabilities within digiKam - but more on those later!

My verdict? I think digiKam is the best open source photo management tool out there. It's robust, powerful and best of all, it has an API (which I have yet to try!). There also exists a more user-friendly interface, Showfoto, which most users will find very attractive. digiKam isn't going to give Photoshop a run for its money just yet, but it's definitely a notch better than Picasa, and pretty close to Lightroom!

 

2010-12-13

Why ScribeFire has a long way to go

ScribeFire is an extension for the Firefox, Chrome and Safari web browsers. I had used it in the past, circa 2008 (so, an era ago in Internet time). While it beats Blogger's own editor (and those of a number of other free blogging sites, including WordPress), I had given up on it last time since I had run into intermittent problems (specifically around uploading images, losing posts while editing, etc).

Fast forward 2 years. I figured it would have had a number of improvements, and that I'd be able to use it. Sure enough, I installed it (and was pleased that it's available on Chrome as well, which I started using about a year ago). I decided to install it on Firefox first, just to see how things go.

  • I was able to authenticate myself with Blogger
  • The interface seemed a bit better than before, even though the pane to write the post is still small and somewhat out of proportion
  • On Firefox, copy / paste of images works, but not on Chrome
  • When a post with images is published to Blogger from Firefox, the images do not appear on Blogger
  • When a post has images added to it on Chrome, it asks for access to Picasa Web Albums. All images are then uploaded to Picasa and referenced in the blog. Wonder why this doesn't happen on Firefox?
  • While trying to resize the "Post Content" pane on Firefox, if you accidentally click on the ScribeFire icon on the status bar, you lose all edits that you have made to your post, without warning
  • The layout of the buttons (such as "Send to Blog") gets screwed up as you try to resize the ScribeFire panes
  • The small button above the ScribeFire add-on pane to close the add-on doesn't always work on Firefox
  • Since the button layout can get easily screwed up (at least on Firefox), you may find your post gets scheduled instead of being posted to the blog
  • Adding links to posts is still archaic. At the very least, it should capture contents from the clipboard, making it easier to add hyperlinks
  • User preferences should be remembered. For example, if an image is added to a post with orientation "middle", it should remember that as the default choice subsequently
  • Some keyboard shortcuts should be introduced for things such as adding hyperlinks. Where keyboard shortcuts already exist, these should be added to the tooltips
  • Posting with ScribeFire on Chrome doesn't always add a post to Blogger. ScribeFire sometimes displays the posts as scheduled, but I couldn't find them on Blogger
  • ScribeFire doesn't seem to be handling the posting schedule properly. Two posts on the same day were posted in reverse order

On my wish list, I would also like to see:

  • A capability to add "References" to my post - this could simply be a bullet list of all links in the post (like another Firefox plugin, Zotero)
  • Ability to integrate with search engines to suggest more references on-the-fly

2010-12-01

Amazon EC2 Costs - A reality check

Ever since Amazon introduced its EC2 virtual computing environment, its aggressive pricing has been crucial to disrupting the market. However, it is quite easy for vendors / System Integrators to think that moving to the "cloud" would dramatically lower costs, while in some cases the costs may go up!

Here's what you see on the EC2 website when you want to find the costs:


Looking at the pricing in the image above, you'd think the costs aren't going to be that high. Most of the people I have come across only remember the sound bite of 34 cents/hour.
However, simple math dictates the yearly costs could be fairly substantial, as the table below depicts:

| OS         | EC2 Instance     | Demand Type | Cost / Hr | Hours | Length | Total      |
|------------|------------------|-------------|-----------|-------|--------|------------|
| Windows    | HCPU Extra Large | OnDemand    | $1.16     | 8,736 | Year   | $10,133.76 |
| Windows    | Extra Large      | OnDemand    | $0.96     | 8,736 | Year   | $8,386.56  |
| Linux/UNIX | Extra Large      | OnDemand    | $0.68     | 8,736 | Year   | $5,940.48  |
| Linux/UNIX | HCPU Extra Large | OnDemand    | $0.68     | 8,736 | Year   | $5,940.48  |
| Linux/UNIX | Large            | OnDemand    | $0.68     | 8,736 | Year   | $5,940.48  |
| Windows    | HCPU Extra Large | Reserved    | $0.50     | 8,736 | Year   | $4,368.00  |
| Windows    | Large            | OnDemand    | $0.48     | 8,736 | Year   | $4,193.28  |
| Windows    | HCPU Medium      | OnDemand    | $0.29     | 8,736 | Year   | $2,533.44  |
| Linux/UNIX | Extra Large      | Reserved    | $0.24     | 8,736 | Year   | $2,096.64  |
| Linux/UNIX | HCPU Extra Large | Reserved    | $0.24     | 8,736 | Year   | $2,096.64  |
| Linux/UNIX | HCPU Medium      | OnDemand    | $0.17     | 8,736 | Year   | $1,485.12  |
| Linux/UNIX | Large            | Reserved    | $0.12     | 8,736 | Year   | $1,048.32  |
| Windows    | Small            | OnDemand    | $0.12     | 8,736 | Year   | $1,048.32  |

The table above lists the costs of various "instances" of the compute units offered by Amazon, the hourly costs for them, and the corresponding yearly costs. These "instances" can have either Linux/Unix OS or Windows (priced differently), and you can even choose to "reserve" an instance for your use, which may even save some money. 
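The yearly totals are simply the hourly rate multiplied by 8,736 hours (which equals 52 weeks x 7 days x 24 hours). Here is a minimal sketch of that arithmetic; the function and variable names are my own, and the rates are copied from a few rows of the table above.

```python
from decimal import Decimal

# 52 weeks x 7 days x 24 hours = 8,736, the hours figure used in the table.
HOURS_PER_YEAR = 52 * 7 * 24


def yearly_cost(hourly_rate, hours=HOURS_PER_YEAR):
    """Cost of running one instance non-stop; the rate is passed as a string for exact decimals."""
    return Decimal(hourly_rate) * hours


# Hourly rates (in dollars) taken from the table above.
rates = {
    "Windows HCPU Extra Large, OnDemand": "1.16",
    "Windows HCPU Extra Large, Reserved": "0.50",
    "Linux/UNIX Extra Large, OnDemand": "0.68",
    "Linux/UNIX Extra Large, Reserved": "0.24",
}

for name, rate in rates.items():
    print(f"{name}: ${yearly_cost(rate):,.2f} per year")
```

Using Decimal rather than floats keeps the cents exact, which matters when comparing against the published totals.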
Clearly, computing cost on EC2 isn't as cheap as it seems initially. For example, something called an "Extra Large" instance will cost upwards of $8,000 per year, while "HCPU Extra Large" will cost over $10,000. Further, it isn't clear what this "instance" is, and what is included in an instance. So a little more digging is necessary to determine what is meant by, or included in, various instances. Here's a handy table describing the instances:

| Instance                                        | Memory (MB) | Virtual Core | ECU  | ECU per Core | Storage (GB) | I/O        | Platform  |
|-------------------------------------------------|-------------|--------------|------|--------------|--------------|------------|-----------|
| Micro Instance                                  | 633         | 1            | 2    | -            |              |            | 32/64 bit |
| Small Instance – default                        | 1740.8      | 1            | 1    | 1            | 160          | Moderate   | 32 bit    |
| Large Instance                                  | 7680        | 2            | 4    | 2            | 850          | High       | 64 bit    |
| Extra Large Instance                            | 15360       | 4            | 8    | 2            | 1690         |            | 64 bit    |
| High-Memory Extra Large Instance                | 17510.4     | 2            | 6.5  | 3.25         | 420          | Moderate   | 64 bit    |
| High-Memory Double Extra Large Instance         | 35020.8     | 4            | 13   | 3.25         | 850          | High       |           |
| High-Memory Quadruple Extra Large Instance      | 70041.6     | 8            | 26   | 3.25         | 1690         |            | 64 bit    |
| High-CPU Extra Large Instance                   | 7168        | 8            | 20   | 2.5          | 1690         | High       | 64 bit    |
| Cluster Compute Quadruple Extra Large Instance* | 18432       | 8            | 33.5 | 4.1875       | 1690         | Very Large | 64 bit    |
| Cluster GPU Quadruple Extra Large Instance **   | 18432       | 8            | 33.5 | 4.1875       | 1690         | Very Large | 64 bit    |

The above table lists the various instances offered by Amazon, the memory included (in MB) for each instance type, the virtual processor cores for each instance, storage (in GB) and whether the instance is 32 or 64 bit. Also provided is a breakdown of EC2 Compute Units (ECU) per virtual processor core. The idea here is to cover the most common computing needs that the large majority of consumers would have, and to offer computing in those flavours.
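The "ECU per Core" column is just the instance's total ECU divided by its number of virtual cores. A minimal sketch of that calculation follows; the names are my own, and the figures are copied from the instance table.

```python
def ecu_per_core(ecu, cores):
    """Divide an instance's total ECU evenly across its virtual cores."""
    return ecu / cores


# (total ECU, virtual cores) for a few instance types from the table.
instances = {
    "Small": (1, 1),
    "Large": (4, 2),
    "Extra Large": (8, 4),
    "High-Memory Extra Large": (6.5, 2),
    "High-CPU Extra Large": (20, 8),
    "Cluster Compute Quadruple Extra Large": (33.5, 8),
}

for name, (ecu, cores) in instances.items():
    print(f"{name}: {ecu_per_core(ecu, cores)} ECU per core")
```

It is a quick way to see that a High-CPU Extra Large core (2.5 ECU) is only slightly faster than a Large core (2 ECU); the instance simply has more of them.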
The GPU instance is a recent addition that provides graphics-intensive cores, obviating the need for small businesses to buy specialised servers for graphics-intensive tasks (such as computer-aided animation or design). However, you'll need to dig further to determine exactly what a virtual processor is, and this is where Wikipedia comes to the rescue. An EC2 Compute Unit (in other words, the virtual processor in each of these instances offered by Amazon) is roughly equivalent to a 1.0 to 1.2 GHz 2007 Xeon or Opteron processor. It has a CPU PassMark score of approximately 400. However, some websites report a lower score.
Armed with all that information, if you then decide to ask Amazon to provision a set of instances to serve up your website, storefront, etc., beware! There are a lot more elements that make up your final (monthly) bill. Here's a quick peek at the most common customer examples available on the Amazon EC2 website:
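
| Sample Scenario            | Instances       | EC2 Total | S3 Total | VPC total | DB Total | Cloud Front Total | Discount | Data    | Total    |
|----------------------------|-----------------|-----------|----------|-----------|----------|-------------------|----------|---------|----------|
| Marketing Web Site         | 2 Large         | 1054.08   | 605.72   | 0         | 0        | 197.8             | -4.5     | 74.8    | 1927.9   |
| Web Application            | 5 Large 2Xlarge | 4256.41   | 2771.12  | 471.5     | 434.9    | 0                 | -24.52   | 2274.19 | 10183.6  |
| Media Application          | None            | 0         | 42.02    | 100       | 0        | 853.33            | -1.85    | 1.15    | 994.65   |
| HPC Cluster                | None            | 0         | 780.76   | 5510.4    | 0        | 0                 | -4.45    | 1117.7  | 7404.41  |
| Disaster Recovery & Backup | None            | 1610.4    | 364.07   | 0         | 0        | 0                 | -4.47    | 318.05  | 2288.05  |
| European Web App           | None            | 2191.58   | 98.66    | 21.79     | 21.79    | 0                 | -26.32   | 463.95  | 2771.45  |
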
If you think the table above is scary, then... well, yeah, it is. It contains a flavour of the variables that Amazon uses to calculate the price per month that it will charge you, and clearly there's more to the cost than merely the compute currency. Some of the key elements that go into it are:
  • Compute Unit (as we discussed)
  • Usage (in hours)
  • Location of usage (Amazon offers US East Coast, US West Coast, Europe and Asia Pacific)
  • Additional Amazon services used, such as:
    • Elastic IP address remapping
    • Amazon Elastic Block Store (EBS)
    • Elastic Load Balancing
    • Amazon CloudWatch
    • Amazon Web Services
    • Amazon S3
    • Virtual Private Cloud
    • etc...
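As a sanity check on the sample scenarios, each monthly total is just the sum of its line items (with the discount entered as a negative number). Here is a minimal sketch that recomputes the totals from the figures in the scenarios table; the function and variable names are my own.

```python
from decimal import Decimal

# Line items in table order: EC2, S3, VPC, DB, Cloud Front, Discount, Data.
# Each entry pairs the line items with the total stated in the table.
scenarios = {
    "Marketing Web Site": (["1054.08", "605.72", "0", "0", "197.8", "-4.5", "74.8"], "1927.9"),
    "Web Application": (["4256.41", "2771.12", "471.5", "434.9", "0", "-24.52", "2274.19"], "10183.6"),
    "Media Application": (["0", "42.02", "100", "0", "853.33", "-1.85", "1.15"], "994.65"),
    "HPC Cluster": (["0", "780.76", "5510.4", "0", "0", "-4.45", "1117.7"], "7404.41"),
    "Disaster Recovery & Backup": (["1610.4", "364.07", "0", "0", "0", "-4.47", "318.05"], "2288.05"),
    "European Web App": (["2191.58", "98.66", "21.79", "21.79", "0", "-26.32", "463.95"], "2771.45"),
}


def monthly_total(line_items):
    """Add up the monthly line items exactly, using decimal arithmetic."""
    return sum((Decimal(item) for item in line_items), Decimal("0"))


for name, (items, stated) in scenarios.items():
    total = monthly_total(items)
    print(f"{name}: computed {total}, stated {stated}, match: {total == Decimal(stated)}")
```

For the Web Application scenario, for example, the EC2 charge is well under half of the monthly bill; storage and data transfer make up most of the rest.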
Clearly, there is a lot more to Amazon's pricing than the promise of 10 cents/hour. Perhaps these are the variables that every Data Center administrator has to work with when planning to install new equipment. However, to some extent, this disruptive pricing and offering means a fairly steep learning curve for the rest of the market. I suppose the initial adopters would be technology enthusiasts AND, more importantly, competitors, which would do nothing but add fuel to the fire of Amazon's advertising for this offering!
Next time - more details on the EC2 pricing!
–Elast