```Blobby Land
Gordon J. Pace
Chris Porter
Christian Colombo
March 2010
Who killed Mr Blobby?


Mathematics and (partly) Computer Science
are really all about one question: What can
we say about objects which can be described
in a finite manner?
Compression is usually done by trying to find
regularities in the source:


Finding concise rules which generate part of the
data bigger than the rule.
Find common subparts and describe them once
Mr Blobby is innocent

Applications of compression are numerous:



Smaller data for transmission and storage e.g. fit
a whole movie at high resolution on a single DVD.
Less resources if the description we are
compressing is of a real life process: e.g. process
chain optimisation.
Allow you to win the 2010 ICTSA Programming
Challenge
Mr Blobby woz ’ere

Given:



an image with a 3-bit depth (8 colours)
a number n
To produce:



not more than n blobs of paint approximating the
original image
in less than one minute.
using as little as possible paint
From Mr Ducky…
… to Mr Blobby
… to Mr Blobby
In 16 blobs
… to Mr Blobby
… and no, I did
not produce this
automatically
Some Interesting Notes

The problem of finding an optimal solution is hard:




There’s an infinite search space (allowing fractions, and
blobs with centres outside the grid)
But could be reduced to a finite one if we ignore the less
paint is better constraint.
Someone asked “is it NP-complete?” It is in NP (so
certainly not worse than NPC), and seems to be in
NPC, but am not sure.
A brute force approach is out of the question
Here’s an evaluation image (given
with 5, 50, 100, 1000 blobs)
Here’s what one team
produced (1000 blobs)
and another team (1000 blobs)
and yet another (1000 blobs?!)
The Teams
The Teams: Industry-Based
1.
2.
3.
4.
5.
6.
7.
CasaSoft: Karl Cassar, Mark Cassar
Exigy Ltd: Christian Tabone, Clint Spiteri, Bernard Cachia,
Jean-Paul Sultana
Ixaris: JP Ebejer, Jimmy Borg, Brian Vella, Brian Attard Bason
Gigi tal-MITA: Daniel D'Agostino
CCBill EU: Emerson Farrugia, Christopher Torpuland, Briun,
Holistic: Ruben Gatt, Matthew Falzon, Shaun Camilleri, Luke
Rocco
Atlas Insurance: Karl Fenech
The Student Teams
8.
9.
10.
11.
12.
13.
14.
15.
16.
Iš Vilniaus: Michael Debono, Andrew Borg Cardona
papa charlie: Ian Agius, James Scicluna
WM\{James}: Ingram Bondin, Kevin Falzon, Andrew Gauci
daemon: Kurt Farrugia, Stefan Lia
AJaZ: Andrew J Said, Julian Zammit
C^D: Claire Dimech, Daniela Cauchi
y.a.t.n.: Darren Demicoli
TDGS: Kyle Pullicino, Adrian Duca, Kurt Portelli, Paul Felice
B3ANi3S: David Borg, Duncan Borg, Thorsten Callus
The Teams and their Solutions
01: CasaSoft







Implemented using C# (.Net Framework 3.5)
Based on a genetic algorithm.
Reduced search space by specifying a boundary in which blobs
can be placed and using bounded accuracy.
Each generation of the genetic algorithm results in the best found
blob according to the population which gives the best score. The
genetic algorithm continues if there is still time available.
The population size and the number of epochs per generation
are calculated as a function of the bitmap size.
An optimization to automatically calculate the largest square that
can fit in the circle, enabling easy flagging of pixels to be painted.
The fitness function does not to check for similar pixels, so as not
to slow the speed of the algorithm.
An Evaluation Problem (50,
100, 1000 blobs)
And their solution (50 blobs)
02: Holistic

Implementation language C#

Used a basic Genetic Algorithm to
approximate the result.
No fine tuning and optimizations were done
on this algorithm.
Alas, all problems timed out 


03: Ixaris

Approach:






Based on genetic algorithms (where genes are blobs)
Initial population based on each colour layer being considered as one blob (and 25
random mutants generated on this)
Genetic Operators – split (gives us two blobs), swap blobs, forceful remove, remove (if
the blob has little effect), tweak radius, tweak centre position.
Changing probability distribution for the above
Multi-Threaded – reproduction of chromosomes done in parallel because of inherent
independence from each other
Issues




Loads of magic variables (e.g. children count, max fit population, noise filter threshold,
initial population, convergence limit, probability distributions for genetic operators)
Interesting note: how to find good approximations for these (in a few hours)? (Neural
Networks, Approximate Bayesian Computing for parameter estimation seem too
onerous)
As it very so often happens, last minute improvement fixes actually broke the solution
(image reflected along x-axis – damn cartesian coordinate system)
Don’t laugh. (Fixed at 8:20pm – too late).
An Evaluation Problem (5, 50,
100 blobs)
And their solution (50 blobs)
04: Gigi tal-MITA

Implementation language: C

Using a floodfill approach, identify connected areas of the same colour.
Count the pixels in each area, order them in descending order of number of pixels, and
calculate the centroid of each group.
Fit circles into the image:

Start at the centroid with a circle of maximum radius for that area.

Decrease the radius until it has a pretty damn high level of inclusion (defined by a
threshold variable)

Inclusion is good if it includes that particular colour or even any other - as long as it's
not white (background colour)

Blob the circle with the best fit
Repeat these steps, renumbering pixels which still have not been blobbed.




On Sunday morning I realised I did not have enough time to complete the algorithm as I
would have wanted to. Realising that submitting an average solution is better than
submitting nothing at all, I went for a compromise - I simply used a circle of average radius.
Although this leaves room for a good deal if inaccurate pixels especially if the area is noncircular, overall it has given me an accuracy of 50-69%, which I guess is reasonable for an
approximate solution.
An Evaluation Problem (5, 50,
100 blobs)
Their Solution (100 blobs)
05: Pr0nsters Inc. CCBill EU

Our solution has the following main stages:







Start by detecting colour edges.
Next, using the edges, generate a Voronoi tessellation diagram to create a
number of polygons giving the locations where cells meet.
These are transformed into circles.
Clustering/filtering mechanism:




Edge detection
Voronoi generation
Clustering/filtering
Take the most frequent colour and use that as our base blob to cover the image.
Calculate circle overlaps - anything more than a fixed percentage gets dropped
(changing the threshold until the number of blobs is reduced as required).
Stuff we didn’t manage to complete in time:

A better clustering mechanism by use of the DBScan algorithm to detect “noise”
circles. This was implemented but was too late to be merged in the main program
An Evaluation Problem (5, 50,
100, 1000 blobs)
Their Solution (1000 blobs)
07: Atlas Insurance
Technologies: C#, .NET Framework 3.5, multithreading.

Heuristic driven using image resampling:




From the smallest downsample containing as-yet-unmatched pixels, randomly
pick one unmatched pixel, and use it as the bounding square for a blob which
would set it to the correct color.
Repeat the following until the largest upsample has been processed:





Bitmap is downsampled, by a factor of 2, to a minimum resample area of 1×1.
And also upsampled, by a factor of 3, to a maximum area of 2048×2048.
Compute the respective similarity gains obtained from transforming the bounding
square by 1 pixel (only) in all directions, and pick the best transformation as the
improved bounding square.
Multiply the bounding square’s dimensions and bottom-left coordinate by the stepwise
resample factor (2 or 3) so as to translate it onto the corresponding pixels in the next
(larger) sample.
Add the blob and go to the next unmatched pixel…
Downsamples guarantee convergence of optimal bounding square for next blob
in O (log2 n) time.
Upsamples help by achieving sub-pixel precision on the original bitmap.
An Evaluation Problem (5, 50,
100, 1000 blobs)
Their Solution (1000 blobs)
08: Iš Vilniaus
Implementation Language: C#

Heuristics and Algorithms







A modified maximum number of blobs is used in generation.
For performance, a simpler similarity function to the original bitmap was employed that only
considered the number of absolute correct pixels.
Method 1: To generate timer-safe blobs, the center of the rectangular area defined by each
component is used as the centre of a blob.
Method 2: A Genetic algorithm





Images larger than 100x100 pixels area were resized down to fit an area of 100x100.
The image is split into single colour components using a floodfill algorithm.
Components are considered largest first.
Chromosomes were configurations of a random number of components up to the limit, and the
number of splits for each component.
Each split of the components became a blob and the Center of Gravity calculated from the area of
the split sub-component. The radius was calculated from the width and height of the split and also
from the difference to the center of gravity.
The similarity function is used as a fitness function.
Termination upon timeout or convergance.
The best solution is used
Evaluation Problem (5, 20, 100
blobs)
Their Solution (5 blobs)
09: Papa Charlie


Language of choice: Java
The algorithm:




Divide the bitmap into components, each holding a colour
area;
The similarity calculator was used to find the best result on
a component;
Until the maximum number of blobs is reached, repeat the
following procedure:
 Remove components with a small number of correct pixels
 Subdivide components with the most wrong pixels
Heuristics to cater for special cases were also created.
Another Evaluation Problem
(5, 50, 100 blobs)
Their solution with 100 blobs
10: WM\{James}

A suite of 3 algorithms, run separately, and the best result
chosen.





Colour Based Overapproximation: For each colour (starting with
the most frequent one), the centroid of the extremal points of the
points where the colour appears is calculated, and a blob drawn
covering the whole area. Very fast but crude algorithm.
Cluster Based Overapproximation: Contiguous clusters of colours
are ordered by size, largest first. The algorithm then assigns blobs of
colour to each cluster until the blobs are exhausted, using the
centroid approach. Increased cost to generate the clusters.
Cluster Based Overapproximation and Merging: Build upon
previous by also merging clusters which have not been assigned a
blob to an existing blob.
Result is also checked with a different blob ordering (biggest
first).
There are also minor trigonometric optimisations.
Another Evaluation Problem
(with 5, 20, 100 blobs)
Their solution with 20 blobs
11: daemon


“We used java to implement our blobify. The
approach we took was a genetic algorithm.”
All the problems timed out 
12: AJaZ
Tackled the problem in C# on .NET Framework 3.5


If there are enough blobs, put a blob of radius 0.0001 (we liked the way such a
blob can cover a pixel!) at the centre of each pixel.
If not, use a Genetic Algorithm




Large images (>2000 pixels), are scaled down to around 1000 pixels.
Places blobs randomly (with no colour)
Colour is calculated to be the most common color of the visible part of the circle
Mutation:






Blobs with low amount of correct pixels are removed
Blobs that overlap on a blob of the same color are removed
Other Mutations: Change position, Change Radius, Change Blob order
chances for each one are variable and change depending on which produces the most
"better" mutations. (Sort of tailor making the mutation for every different bitmap)
The last 5 seconds spend mutating the best solution
Stopping condition: 57 seconds (excluding input and output)
Guess the original image
(1000 blobs)
Che Guevara
(5, 50, 100, 1000 blobs)
13: C^D


The blobmap is divided into monochrome blobmaps.
With each monochrome blobmap:





Extract a rectangle from this map such that all colored
pixels are within it.
Create a blob with radius half the smallest side of the
rectangle.
If half the pixels in the rectangle are coloured then use the
blob
Otherwise, split the monochrome map and repeat. Different
splitting mechanisms were used.
The blobs used are the biggest ones generated.
The smiley faces ended up as…
this using 100 blobs
14: y.a.t.n.

Implementation language: C

Randomly place blobs in areas of the image which
use the same colour.
The best blob (in terms of correct pixels and paint
used) is kept.
The remaining blobs are randomly refined by adding
blobs in positions which enclose the same colour of
the blob being put and again assessing the
similarity.
Repeat until the algorithm converges or until 50
seconds have elapsed.



Guess who this is (1000 blobs)
5, 50, 100, 1000 blobs
15: TDGS




Language used: Java
The size of the input bitmap is increased to a square
with sides of length a power of two
The bitmap is then split into multiple grids of various
sizes to determine the occurrence of colours at
various areas of the image. In those areas where
the occurrence of a colour is higher than a certain
threshold, a blob is added to cover that area.
The grids in an outward spiral fashion (“just to be
cool”)
Another Evaluation Problem
(5, 50, 100, 1000 blobs)
1000 blobs
16: B3ANi3S

Tried to emulate the way a human would try to solve
the problem:




Recognize the basic shapes in the picture by reading the
bitmap line by line, storing the coordinates of the start and
end points of each line having the same colour, and
eliminating errors through sampling.
From the coordinates collected earlier, calculate the
optimal blob position and size which will fill all the shape
with least overflow, and if the total number of blobs is
exceeded, remove the smallest blobs.
Order the blobs by checking the perimeter of the shape in
the original image and seeing which colour is best on top.
No working solution submitted 
Some other impressive solutions
More impressive solutions
Some other impressive solutions
Some other impressive solutions
Some other impressive solutions
Some other impressive solutions
Some other impressive solutions
Some other impressive solutions
But who won?
Student Teams
#
Team Name
Score
Student Teams
#
Team Name
09
papa charlie
Score
720
Student Teams
#
Team Name
Score
08
Iš Vilniaus
765
09
papa charlie
720
Student Teams
#
Team Name
Score
10
WM\{James}
08
Iš Vilniaus
765
09
papa charlie
720
1075
Student Teams
#
Team Name
Score
15
TDGS
1790
10
WM\{James}
1075
08
Iš Vilniaus
765
09
papa charlie
720
Student Teams
#
Team Name
Score
12
AJaZ
2685
15
TDGS
1790
10
WM\{James}
1075
08
Iš Vilniaus
765
09
papa charlie
720
Industry Teams
#
Team Name
Score
Industry Teams
#
Team Name
04
Gigi tal-MITA
Score
455
Industry Teams
#
Team Name
Score
03
Ixaris
465
04
Gigi tal-MITA
455
Industry Teams
#
Team Name
Score
05
CCBill EU
03
Ixaris
465
04
Gigi tal-MITA
455
1595
Industry Teams
#
Team Name
Score
01
CasaSoft
1710
05
CCBill EU
1595
03
Ixaris
465
04
Gigi tal-MITA
455
Industry Teams
#
Team Name
Score
07
Atlas Insurance
3380
01
CasaSoft
1710
05
CCBill EU
1595
03
Ixaris
465
04
Gigi tal-MITA
455
Merging the Results
#
Team Name
Score
07 Atlas Insurance
3320
12 AJaZ
1260
01 CasaSoft
840
05 CCBill EU
830
15 TDGS
690
Prosit tal-Programmi!
Hope to see you again next year…
To check the evaluation problems were unchanged:


Take the encrypted file
1.
2.

Il-materjal li inbeżaq mill-impatti ikkrejaw krateri oħra
sekondarji. Xi krateri għandhom kisja ta' l-eġekta (materjal) li
jifraq għal mijiet ta' kilometri mill-punt tal-impatt.
Rename to .zip
3.
4.

Il-Qamar huwa l-unika satellita naturali tal-Art. Jorbita il-pjaneta
tagħna minn 350,000 għal 400,000 km, jakkonpanja l-Art fittidwir madwar l-istilla Sol (Xemx).Ġieli jissejjaħ Luna (millKelma Latina għal qamar), jew Selene.
Thanks to…
```