The terms ML5, TensorFlow, and PoseNet have been loosely tossed around the DF studio by my peers in Body-Centric Tech as the semester nears its end, and perhaps I should feel lucky to finally dive into them, regardless of the stress of living deadline to deadline 🙂 But, snipes aside, this was a whole other realm of knowledge that was extremely new to me, and I decided that by getting a better sense of what ML5 and its relatives actually are, I would find their true appeal. Below are the key bullet points I made for myself:
- ML5 – Machine Learning 5, which lives symbiotically with p5.
- ML5 doesn’t really need p5, but p5 makes things easier for you. Under the hood, ML5 lets you use TensorFlow.js, whose mother is TensorFlow.
- ML5 deals with a lot of pre-trained models.
- The ML5 library doesn’t know a lot about humans.
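To make those bullets concrete, here is a minimal sketch of classifying an image with ml5 and no p5 at all. This is my own illustration, not code from the ml5 examples: the element id, the image, and the ml5 v0.x promise-style API are all assumptions on my part.

```javascript
// Hypothetical sketch: classify an <img id="photo"> with MobileNet,
// using plain DOM calls and promises instead of p5.

// Pure helper: render one result object as "label (NN%)".
function describeResult(result) {
  return `${result.label} (${Math.round(result.confidence * 100)}%)`;
}

// Browser-only part: assumes ml5.js is loaded via a <script> tag.
function classifyPhoto() {
  const img = document.getElementById('photo'); // placeholder id
  return ml5.imageClassifier('MobileNet')       // loads the pre-trained model
    .then(classifier => classifier.classify(img))
    .then(results => results.map(describeResult));
}
```

The point of the sketch is the division of labor: ml5 fetches the pre-trained MobileNet weights and hands back an array of { label, confidence } guesses, while p5 only ever enters the picture for convenience.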
After looking at all the juicy examples, I wanted to explore the capabilities of Image Classification with MobileNet, since it is “trained to recognize the content of certain images.” I wondered: to what extent has MobileNet been trained, and how? In what context? Who were its creators, and how did they train it? What kind of MobileNet am I tampering with here? Has it perhaps been… westernized? Or is it just based on completely obscure datasets?
With these burning questions, I played around with the drag-and-drop/upload-an-image example on the ml5 index page. Below are some of my discoveries with images available on my desktop.
A male Japanese celebrity is a diaper/nappy, a Sim from The Sims 4 is apparently a bathing cap, and Poliwag from Pokémon is a CD player. Great! The model’s confidence dropped considerably for anything with human-like features. I continued uploading screenshots I had taken of animated/drawn media, which led to an interesting result:
Game fanart, SpongeBob, and a cover of the Case Closed manga were ALL classified as Comic Book, with varying degrees of confidence. I started to realize how even images that are completely different can all be filed under one type of image. From there, I decided to upload pictures of myself (yes, I volunteer! As tribute!) because I was entirely curious how MobileNet would classify my appearance.
Surprisingly, I am Abaya. The funny thing about MobileNet classifying this as Abaya is how specific the word is to a Middle Eastern context: culturally, women of the Arabian Gulf wear a black dress/cloak called an abaya in their daily lives. I would have thought that only the first image could be classified as Abaya, because of the black material I am wearing on my head, but it did the same for my lighter-colored one. Whatever its confidence in recognizing me, the model varies in how strongly it labels the headscarf as Abaya rather than, say, Hijab. I wanted to see if this held for random images from the internet, and it turned out to be true.
A random lady with a headscarf is more of an Abaya than the model wearing the Nike sports hijab, but both still came out as Abaya. How fascinating! I decided to take this further by setting up a webpage to see the other probabilities the classifier would assign to images it considered majority “Abaya.”
Based on the array of results in the console, everything other than abaya scored significantly lower: Abaya was at 66%, while Cloak was at 29% and Sweatshirt at 5%. So Cloak would be the next way I would be identified, and then Sweatshirt. I tried this with a few other images; the results are below:
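The webpage boiled down to something like the sketch below. This is a reconstruction rather than my exact code: it assumes the ml5 v0.x callback API, p5.js loaded on the page, and a placeholder image filename.

```javascript
// Hypothetical p5 + ml5 sketch: classify one image and log the whole
// results array, not just the top label.

let classifier;

// Pure helper: format ml5's results array as "label: NN.N%" strings.
function formatResults(results) {
  return results.map(r => `${r.label}: ${(r.confidence * 100).toFixed(1)}%`);
}

// p5 entry point -- only runs in a browser with p5.js and ml5.js loaded.
function setup() {
  noCanvas();
  // 'headscarf.jpg' is a placeholder filename.
  const img = createImg('headscarf.jpg', 'test image', '', () => {
    classifier = ml5.imageClassifier('MobileNet', () => {
      classifier.classify(img, (err, results) => {
        if (err) return console.error(err);
        // ml5 sorts results by confidence, highest first.
        console.log(formatResults(results));
      });
    });
  });
  img.hide();
}
```

With a 66/29/5 split like the one in my console, this would print ["abaya: 66.0%", "cloak: 29.0%", "sweatshirt: 5.0%"].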
With different images, the third identifier in the array was Ski Mask or Bath Towel, so Sweatshirt wasn’t a consistent third. And yet, when I loaded the random images below:
Sweatshirt came out as the second result. So we have Abaya, Cloak, and Sweatshirt to identify a woman wearing a headscarf, which is very interesting to me, since none of the results were specific words such as Hijab or Headscarf. MobileNet didn’t identify the scarves as what they are actually known as: something worn on the head for religious reasons. Perhaps a good takeaway is that the machine knows no concept beyond which objects/animals/types of things an image might resemble, which generalizes the experience more than it specifies it, however weirdly specific the resulting labels were. In itself, perhaps the MobileNet dataset is paradoxical in nature.
To step it up a notch, I decided to turn on my webcam to see whether MobileNet could identify my appearance in real time, and how different the results would be from Abaya (or Cloak, or Sweatshirt).
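The webcam version can be sketched as below. Again, this is an assumption-laden reconstruction (ml5 v0.x callback API, p5.js, and a browser with camera permission), not the exact code I ran.

```javascript
// Hypothetical real-time sketch: re-classify the webcam feed in a loop
// and draw the current top guess on the canvas.

let video;
let classifier;
let label = 'loading...';

// Pure helper: top result as "label (NN%)"; ml5 puts the highest
// confidence first.
function topLabel(results) {
  const best = results[0];
  return `${best.label} (${Math.round(best.confidence * 100)}%)`;
}

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.hide();
  // Passing the video element once lets classify() reuse it each frame.
  classifier = ml5.imageClassifier('MobileNet', video, classifyFrame);
}

function classifyFrame() {
  classifier.classify((err, results) => {
    if (err) return console.error(err);
    label = topLabel(results);
    classifyFrame(); // loop: keep the label updating in real time
  });
}

function draw() {
  image(video, 0, 0);
  fill(255);
  textSize(24);
  text(label, 10, height - 10);
}
```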
It was interesting to investigate the capabilities of MobileNet in this manner, to see if the ML would exhibit certain biases, whether political or sociological. It turned out to be quite innocent (for now): I may well be identified as an Abaya, a Bonnet, or even a Band-Aid or a Mosquito Net. These terms could even be what one would prefer to be identified with, rather than the terms mass media chooses to describe a woman with a headscarf (or anyone news outlets would describe as being of Islamic faith).
To conclude, the varying probabilities MobileNet gives us in real time could very much be a reflection of how different even people within a faith can be, from the extremes to people simply living their lives, and of why no one should be placed under just one universal definition.