Thursday 23rd September 2021

 Here's some of what you missed at The Thirsty Robot:

Technical Musings - CBIR

Content-Based Image Retrieval is not a new idea. Various researchers over the years have looked at the concept of using the images themselves as a way of finding images in an image database, and there's also the 'Law' that says that:

'Every 'New' technology is an Old technology that someone new has just (re-)discovered.'

Interestingly, Wikipedia puts the 'first seen' date for CBIR in 1993, which presumably means that the CAM (Content Addressable Memory) systems that at least one of the attendees saw back in the 1970s never existed. This may be a manifestation of the 'Nothing existed before the Internet' phenomenon, which limits available sources to those online sources using HTML.


Photo by Adi Goldstein on Unsplash

But since things keep getting re-invented, and they are new and fresh to at least some people each and every time, then CBIR is obviously ground-breaking, amazing technology that allows databases to be searched by using pictures instead of words... Or does it?

One of the tricky things about pictures (oh, and audio too!) is that they are often hard to describe. So a search term like 'Droopy clock' might get you to Salvador Dali's painting, but might also get you to a 'Droopy the depressed cartoon dog'-themed clock. But this is taking language and turning it into a search term, which is not exactly 'using an image to find another image'. So how do you do that?

One method is to turn an image into something that represents the image, but in a more convenient form. Decent fidelity images of real-world objects, even apparently 2D-objects like paintings, can require huge amounts of data. But how do you turn an image into something smaller whilst still retaining as much information related to the 'content' of the image as possible, whilst removing any surplus or extraneous information? The answer is often called 'fingerprinting'.

Not literally fingerprinting, of course. What is required is a way of capturing the important parts of the content - often called 'feature extraction'. Because fingerprints capture features from the ends of fingers, and because they are used as a way of indexing people, the term 'fingerprint' is often used for the 'descriptors' for images that are produced by extracting features from the image. One simplistic method is to look at the overall colour, or any dominant shapes, or edges of things in the image, etc. But if a descriptor (or fingerprint) is going to be really useful then it needs to be immune to changes in the size of the image, or cropping it, or rotating it, or a black and white image of it instead of full colour, or a distorted image, or an inverted image, or... You get the idea.
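To make the 'overall colour' idea a little more concrete, here is a minimal Python sketch of a colour-histogram fingerprint. It is only an illustration, not how any particular CBIR system works: the function names, the choice of 4 levels per colour channel, and the representation of an image as a plain list of (r, g, b) tuples are all assumptions made for the example.

```python
from collections import Counter

def colour_fingerprint(pixels, levels=4):
    """Return a normalised, coarse colour histogram for a list of (r, g, b) pixels.

    'levels' (an assumed parameter) is the number of buckets per colour channel.
    """
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    # Dividing by the pixel count means the size of the image drops out.
    return {bucket: n / total for bucket, n in counts.items()}

def fingerprint_distance(a, b):
    """Sum of absolute differences between two histograms (0.0 means identical)."""
    return sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in set(a) | set(b))

# A tiny 'image', and a version of it with every pixel repeated four times.
small = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (10, 200, 10)]
big = small * 4
print(fingerprint_distance(colour_fingerprint(small), colour_fingerprint(big)))  # 0.0
```

Because the histogram ignores where each pixel sits, this fingerprint does not change when the image is rotated or scaled up by whole-pixel repetition, but it is not immune to cropping, colour inversion or conversion to black and white, which is exactly why it counts as 'simplistic'.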

So CBIR is often associated with 'invariances'. Rotational invariance, scaling invariance, etc. All of these seek to produce a descriptor (or fingerprint) that does not change when you alter the image. So for rotational invariance, you might spin the image and take a photo of that, and then use that spun image to compare with other spun images. For scale invariance, you need some way of making size irrelevant - so instead of spinning the image, you look at the patterns from examining the image from the centre to the edges: a radial transformation. (Note that these aren't the ideal solutions!) The more invariances that your CBIR fingerprinting system has, the better it is at finding an image.
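As a rough illustration of the 'centre to the edges' idea, the Python sketch below averages brightness in concentric rings around the centre of a greyscale image. This is one possible reading of a radial transformation, not a definitive one: the function names, the default of 4 rings, and the use of a list of rows of numbers as the image are assumptions made for the example.

```python
import math

def radial_profile(image, rings=4):
    """Mean brightness in concentric rings, from the centre out to the corners."""
    height, width = len(image), len(image[0])
    # Doubled coordinates keep the squared distances as exact integers.
    max_d2 = ((height - 1) ** 2 + (width - 1) ** 2) or 1
    sums = [0.0] * rings
    counts = [0] * rings
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            dy = 2 * y - (height - 1)
            dx = 2 * x - (width - 1)
            # Relative radius in the range 0..1, so the image size drops out.
            r = math.sqrt((dy * dy + dx * dx) / max_d2)
            ring = min(int(r * rings), rings - 1)
            sums[ring] += value
            counts[ring] += 1
    # Rings that contain no pixels are reported as 0.0.
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]

def rotate90(image):
    """Rotate a list-of-rows image by a quarter turn clockwise."""
    return [list(row) for row in zip(*image[::-1])]

image = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
print(radial_profile(image) == radial_profile(rotate90(image)))  # True
```

Each pixel keeps its distance from the centre under a quarter turn, so the ring averages are unchanged. Scaling the image changes them only approximately, and cropping or shifting the image breaks the descriptor altogether, so this is still a long way from the ideal solution.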

What is fascinating is that with enough invariances, the amount of information required to fingerprint images can get quite low, but the transformations of the image mean that it is no longer human-viewable, and so it is a bit like trying to explain to an Alien what a 'Droopy clock' is. So 'Droopy' is the name of a dog. But what is a dog? What does 'depressed' mean? What is a 'clock'? What is time? And so on. 

An alternative method is to show lots of versions of the image to a computer system, rotated, scaled, colours inverted etc. and to give it the task of finding something that all the images have in common. This 'Machine Learning' also has the problem that when it succeeds in finding two images that match, you have no way of knowing how it did it. A 'black box' that compares two images and says if they are the same image is good for that, but we haven't really learned anything useful about the 'content' of the images, and we don't have the words or numbers to interpret - we might as well be aliens trying to decipher what 'Droopy clock' means.

Nutrition - When is food food?

Rather than spending lots of time cooking, or time and money going to a restaurant, some busy people have turned to minimising the time and effort of preparing food. There are a number of online companies that will deliver kits for making meals, or pre-prepared meals that require only minimal cooking, or pre-cooked 'takeaway' foods, or foods that have been reduced down to their basic ingredients. Huel is one of these 'minimalistic' but complete nutrition solutions. It contains everything you need to live, but removes almost all of the preparation and cooking effort by having dried stuff that you add water to and drink. A kind of full-food gazpacho.

Huel sounds like it might be very useful on a busy Thursday night when 'The Thirsty Robot' is about to start and there's no time to cook!

How big can a shed be?


This is both a trick question, and not a question at all. Obviously, a shed can be any size you want. Except that it can't. Whilst it is physically possible to make sheds very large (and very small), there may be legislation that limits the size of the shed that you can have - in England it depends on the size of your house.






The wrong time for a good idea...


Things of questionable usefulness...



The Pandemic has changed the world forever...


Media Recommendations

Alice in Borderland Season 2 - Streaming soon on Netflix in the UK

Squid Game - Streaming on Netflix soon in the UK. Not influenced by Battle Royale or Alice in Borderland...

Strangely placed Best-selling Books




Books in the correct places

Cuckoo's Egg - Security

One star on Amazon


More than Two stars on Amazon


DIY Ad-Blocking


---

A lot of discussion happens at The Thirsty Robot. This blog is an edited, biased summary of just a small fraction of the conversation, links/URLs and references that were mentioned. It is an imperfect record and is definitely not complete - for that you should visit The Thirsty Robot!

---

The next online meeting at The Thirsty Robot is on a Thursday in 2 weeks' time in 2021 at 7:30pm GMT.





