Explorations #3

Technologies and short projects

Slit Scan with ml5.js

These last couple of weeks I have been exploring a few different directions. One of the things I tried out was slit scanning, which I wanted to combine with ml5.js. With this experiment I wanted to observe whether the machine-trained model could recognize faces if the participant stood still.

This did not work so well. The slit scan itself worked, but the way the frame was spliced by p5.js in the code made it hard for the ml5 system to keep up.
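For context, the core of a slit scan is tiny: each new video frame contributes a single column of pixels to an accumulating output image. A minimal sketch of that logic in plain JavaScript (the nested arrays here are hypothetical stand-ins for p5.js pixel rows, not our actual sketch):

```javascript
// Slit scan core: copy the centre column of each incoming frame into
// the next free column of an output buffer. A frame is an array of
// rows; each row is an array of pixel values.
function slitScanStep(frame, buffer, columnIndex) {
  const centre = Math.floor(frame[0].length / 2);
  for (let y = 0; y < frame.length; y++) {
    buffer[y][columnIndex] = frame[y][centre];
  }
  return buffer;
}

// Example: three tiny 2x3 "frames" scanned into a 2x3 buffer,
// one centre column per frame.
const frames = [
  [[1, 2, 3], [4, 5, 6]],
  [[7, 8, 9], [10, 11, 12]],
  [[13, 14, 15], [16, 17, 18]],
];
const out = [[0, 0, 0], [0, 0, 0]];
frames.forEach((f, i) => slitScanStep(f, out, i));
```

In a p5.js sketch the same idea runs per channel over `video.pixels`, which is also why splicing the frame this way leaves ml5 with no complete face to detect.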


CLM Tracker

This week we also started looking at CLM tracker.

“clmtrackr is a javascript library for fitting facial models to faces in videos or images. It currently is an implementation of constrained local models fitted by regularized landmark mean-shift, as described in Jason M. Saragih’s paper.” (https://github.com/auduno/clmtrackr)




CLM tracker is better at tracking the face and works on many points of the face. Because it maps the face in such detail, CLM tracker has some unique capabilities in projecting interesting shapes onto the face. This code was released in 2016, making it almost a precursor to some of the filter technology that exists now.


This key feature, shown here, is the mapping of shapes onto the face.


Sometimes the tracker works well; most of the time, however, it takes a while to locate the face. It isn't as steady as ml5. This makes it hard to code with, as we figured out while trying to map shapes onto each key point.
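The mapping step itself is straightforward once the tracker has locked on: clmtrackr's `getCurrentPosition()` returns an array of [x, y] pairs, one per key point, and you draw a shape at each. A hedged sketch of that step (the per-point size variation is my own illustration, not from our sketch):

```javascript
// Turn a clmtrackr-style position array (a list of [x, y] pairs)
// into draw instructions: one circle per key point, with the
// diameter varied a little per point purely for visual interest.
function shapesFromPositions(positions) {
  return positions.map(([x, y], i) => ({
    x,
    y,
    diameter: 4 + (i % 3) * 2,
  }));
}

// In a p5.js draw() loop this would look something like:
//   const positions = ctracker.getCurrentPosition();
//   if (positions) {
//     for (const s of shapesFromPositions(positions)) {
//       ellipse(s.x, s.y, s.diameter);
//     }
//   }
// (the `if` matters: getCurrentPosition() is falsy until a face is found)

const demo = shapesFromPositions([[10, 20], [30, 40]]);
```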


One of the unique examples that ships with CLM tracker is the emotion reader, which recognizes an emotion based on where each key point sits on a particular face. The interesting insight for me was that having my facial emotions recognized prompted me to contort my face into different features to take on different emotions. I also wanted to make some notes on the future potential of this kind of technology.
This led me to research the topic further and see what is out there in terms of the pros and cons of this type of technology. As CLM tracker is open source and a library compatible with p5.js, it isn't as sophisticated as the state of the art. Emotion recognition is a subfield of the facial recognition industry, which is projected to grow from 19 billion to 37 billion USD. (https://neurodatalab.com/blog/how-do-technologies-recognize-our-emotions-and-why-it-is-so-promising/)
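As a toy illustration of the idea (not clmtrackr's actual classifier, which fits trained emotion models to the full point set), an "emotion" can be read as a score over key-point geometry, for example comparing mouth-corner height to mouth-centre height for a smile:

```javascript
// Toy smile detector from three hypothetical mouth key points.
// A real emotion model weighs many more points than this; the
// function only shows the shape-to-emotion idea.
function smileScore(leftCorner, rightCorner, centre) {
  // Screen y grows downward, so the score is positive when both
  // corners sit above (smaller y than) the mouth centre.
  const lift = (centre.y - leftCorner.y) + (centre.y - rightCorner.y);
  return lift > 0 ? 'smiling' : 'neutral';
}

// Corners lifted above the centre reads as a smile;
// corners below the centre reads as neutral.
smileScore({ x: 40, y: 95 }, { x: 80, y: 95 }, { x: 60, y: 105 });
smileScore({ x: 40, y: 110 }, { x: 80, y: 110 }, { x: 60, y: 105 });
```

This is also why contorting my face worked: the classifier only sees geometry, not intent.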


Open frameworks

One of our goals throughout this process was also to look at openFrameworks, a creative-coding toolkit written in C++. Earlier in the course we had watched some of Zach Lieberman's videos, which gave us interesting insights into the openFrameworks platform. We downloaded openFrameworks and Xcode, the IDE used for the code. The advantage of starting to use openFrameworks is that we move away from the browser, which is the main disadvantage of p5.js.


We tried out the examples. The first one was the blob example, which is great for hand tracking.

Explorations #2


Weeks 3 and 4 were dedicated to running short experiments and continuing research on projects that combine the body and computer vision in interesting ways.

I have divided the blogpost up into:

  1. Mini-experiments
  2. Research

1. Mini-Experiments

For the mini-experiments I started out by simply sketching out body points using PoseNet and the ml5.js library.



I referenced an example sketch from the PoseNet library to draw a skeleton first. The idea of the sketch was to understand how the points connect, as a basis for further exploration. I wanted to see how I could make two points collide in order to make something happen. This "something" could take the form of a sound or a visual.
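The collision I had in mind reduces to a distance test between two keypoints. A small sketch of that test (in ml5, each PoseNet keypoint carries a `position` object plus a confidence `score`; the threshold values here are my own guesses):

```javascript
// Return true when two PoseNet-style keypoints are within `radius`
// pixels of each other and both were detected confidently enough
// to trust.
function keypointsCollide(a, b, radius, minScore = 0.5) {
  if (a.score < minScore || b.score < minScore) return false;
  const dx = a.position.x - b.position.x;
  const dy = a.position.y - b.position.y;
  return Math.hypot(dx, dy) < radius;
}

// e.g. trigger a sound or a visual when the wrists touch:
const leftWrist = { position: { x: 100, y: 200 }, score: 0.9 };
const rightWrist = { position: { x: 104, y: 203 }, score: 0.8 };
const touching = keypointsCollide(leftWrist, rightWrist, 20);
```

Filtering on the score matters because PoseNet keeps reporting low-confidence positions for occluded joints, which would otherwise fire false collisions.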

I continued the exploration by focusing on facial tracking, as I am drawn to the idea of facial recognition.


These snippets of trying out the code helped with understanding all the trackable points.

Another experiment I tried was BodyPix. This code tracks only the body and blurs out anything that is not the body. I wanted to see if, instead of a dull black background, I could use a function to load an image which would act as the background. Since the sketch is real-time and always moving, the code draws out the colour pixels on every frame.


The code was pretty good at recognizing the body, but it wasn't steady enough. It kept tracking the painting behind me as part of the body as well.


The code for this was:

let video;
let bodypix;
let segmentation;
const options = { outputStride: 16, segmentationThreshold: 0.5 };

function setup() {
  createCanvas(320, 240);

  // load up your video
  video = createCapture(VIDEO);
  video.size(width, height);
  video.hide(); // hide the raw video element, and just show the canvas

  bodypix = ml5.bodyPix(video, modelReady);
}

function modelReady() {
  bodypix.segment(gotResults, options);
}

function gotResults(err, result) {
  if (err) {
    console.error(err);
    return;
  }
  segmentation = result;

  image(video, 0, 0, width, height);
  image(segmentation.maskBackground, 0, 0, width, height);

  // keep segmenting, frame after frame
  bodypix.segment(gotResults, options);
}

Using an image as the background did not work, nor did colours other than black. This is something I would like to come back to in the coming weeks for further exploration.
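Conceptually, what I was after amounts to per-pixel compositing: wherever the segmentation mask marks background, take the pixel from the replacement image instead of the video frame. A sketch of that step over flat RGBA arrays (ml5's mask actually comes back as an image, so this shows the idea rather than our code):

```javascript
// Composite a video frame over a replacement background using a
// per-pixel mask: mask[p] is truthy where pixel p belongs to the
// body. frame and background are flat RGBA byte arrays.
function compositeBody(frame, background, mask) {
  const out = new Uint8ClampedArray(frame.length);
  for (let p = 0; p < mask.length; p++) {
    const i = p * 4;
    const src = mask[p] ? frame : background;
    out[i] = src[i];         // R
    out[i + 1] = src[i + 1]; // G
    out[i + 2] = src[i + 2]; // B
    out[i + 3] = 255;        // fully opaque
  }
  return out;
}

// Two pixels: the first is body (kept from the red frame), the
// second is background (taken from the blue replacement image).
const frame = new Uint8ClampedArray([255, 0, 0, 255, 255, 0, 0, 255]);
const bg = new Uint8ClampedArray([0, 0, 255, 255, 0, 0, 255, 255]);
const result = compositeBody(frame, bg, [1, 0]);
```

Getting at the mask pixel-by-pixel like this, rather than drawing the pre-made black mask image on top, is probably what an image background needs, and is where I would pick this up again.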


2. Further research

Body, Movement, Language:AI sketches with Bill T.Jones

This experiment was conducted with TensorFlow and built by the Google creative team together with the famous choreographer Bill T. Jones. It was remarkable in the sense that it used real-time tracking and speech recognition as a way to track and move words in real time. This opens up possibilities for storytelling through this medium.



Space-Time Correlations Focused in Film Objects and Interactive Video 

I wanted to see how I could also understand these mini-experiments through a theoretical lens. I really wanted to understand the idea of time and space through this medium. I also got interested in the applications of these technologies and the future of cinema. Will the future of cinema change due to our experimentation with these new technologies?

A number of contemporary and more recent art projects have transformed film material into interactive virtual spaces, in order to break through the traditional linear quality of the moving image and the perception of time, and at the same time to represent, or to visualise, the spatial aspects of time. In the times of resampling, the concentration on a relatively old picture medium and its transformation into a space-time phenomenon open to interactive experience does not seem surprising. The results of these experimental works exploring and shifting the parameters of linear film are often oddly abstract and quite expressive in their formal composition, and consciously elude simple legibility.


Explorations #1

This blog post is set to be an exploration for what will be a 12-week study on computer vision, specifically on "using the body as a controller". In the first two weeks we conducted broad research which involved looking at:

1- Technology

2- Literature review

3- Artists in computer vision

4- Computer vision in mobile technology

5- Ethics and design of computer vision



We got the chance to experiment with the Kinect 1, thanks to the library written by Daniel Shiffman for using the Kinect in conjunction with Processing. We set up the Kinect and ran the example code provided with the library. This helps in understanding the code as well as quickly trying out possibilities of using the Kinect in creative coding. An interesting question which arose during our experimentation with the Kinect was the idea of using it as a camera to capture narrative in real time, akin to a film camera.

2- Research (Literature Review)

Many artists and designers have used computer vision in a variety of ways, exploring its possibilities in combination with machine learning. Below is a brief list of the research we conducted to learn the different approaches taken around the subject.


Fei-Fei Li: If We Want Machines to Think, We Need to Teach Them to See

“Computer vision, Li argues, is the key enabling technology for all of AI. Understanding vision and building visual systems is really understanding intelligence”.

Computer Eyesight Gets a Lot More Accurate, by John Markoff, August 18, 2014. https://bits.blogs.nytimes.com/2014/08/18/computer-eyesight-gets-a-lot-more-accurate/

“Machine vision has countless applications, including computer gaming, medical diagnosis, factory robotics and automotive safety systems. Recently a number of carmakers have added the ability to recognize pedestrians and bicyclists and stop automatically without driver intervention”.


If you’re a darker-skinned woman, this is how often facial-recognition software decides you’re a man. https://www.media.mit.edu/articles/if-you-re-a-darker-skinned-woman-this-is-how-often-facial-recognition-software-decides-you-re-a-man/

“Automated systems are not inherently neutral. They reflect the priorities, preferences, and prejudices—the coded gaze—of those who have the power to mold artificial intelligence”.


Cornell University CS5670: Intro to Computer Vision. Instructor: Noah Snavely

This is a PDF from a class offered on computer vision. It gives a useful guide to what computer vision entails.


  1. Will computer vision ever be as good as human vision?

Most probably not, as human vision is better at perception. However, due to some shortcomings in human perception, computer vision can detect a vast number of things and aids humans in many activities. I picked a few slides and have attached the areas I find the most interesting in this vast field.


This may not be much of a surprise, since most phones these days use face detection technology to unlock. The interesting question was why phones detect faces in real time. I couldn't find a satisfactory answer, but I found some informative material in the wealth of information now available around the applications that have spurred face detection, especially at social media companies like Snapchat.

Computer Vision in Mobile Phones

Computer vision and the future of mobile devices

Google’s project tango



This area was extremely exciting to me from an exploration angle. In the last few years we've seen a rise of facial filters and facial tracking. Recently Facebook's company Instagram allowed computer vision artists to become developers and release facial filters.

2. What is the current state of affairs?

A lot of the applications have been developed in the last five years, hence making this a very active field of research.

For companies and research groups active in this field, David Lowe's website maintains a comprehensive list, mostly dedicated to corporations actively working in the field.

Most of these companies work on building large databases to aid machine learning and training. A glance at the list, and further into the websites, was personally informative in understanding how corporations are using this technology.

3. Artists in Computer Vision

Golan Levin

Golan Levin is in many ways the artist best known for combining computer vision with creative coding in ways that question interaction.

Golan Levin has also been instrumental in collaborating with many artists within this field. His work is the basis upon which a lot of social media filters were built.

For a list of his work his website flong provides a list of all his projects.

One of my favourite projects is:


Kyle McDonald


This video gives a great introduction to his work.

Dan Oved

Dan Oved’s final project at NYU’s ITP


Background: for my final project for Intro to Physical Computing and Design for Digital Fabrication at ITP, I created Presence, a kinetic sculpture that moved in the direction of a viewer's gaze. It used a gaze detection convolutional neural network to estimate gaze position for a single user. I showed this installation at the 2017 ITP Winter Show.

Lisa Jamhoury – Artist


Lisa Jamhoury works with facial and pose recognition software to build interactive art through creative coding.

Her work with the Kinect 2 helps connect libraries to different editors such as p5.js and Processing.

Isabel Hirama – algoRhythmic Salsa (PoseNet AI Assisted Salsa Lessons)

Real-time feedback using a laptop's camera to teach salsa.

Stanford and MIT both lead the way in terms of research into neural networks, machine learning, and the training of computer vision.

Stanford Vision Lab


MIT Media Lab



More on Lisa Jamhoury: she is a senior experience designer in machine intelligence at Adobe. Her work uses data from the Kinect 2 through Kinectron, a library and API she developed. This is an interesting application in terms of the ability to control multiple Kinects in multiple locations.



Joy Buolamwini


Joy Buolamwini is a poet of code who uses art and research to illuminate the social implications of artificial intelligence. She founded the Algorithmic Justice League to create a world with more ethical and inclusive technology.


Gender Shades is a project by Joy Buolamwini which focuses on the idea of the "coded gaze". Joy's research examines the biases within these systems, stemming from the training datasets that were available. The research data showed that these classifiers are more likely to misclassify gender for people with darker skin shades.

Zach Lieberman


Zach Lieberman is a co-creator of openFrameworks. His most famous project is the EyeWriter, which uses the eyes to draw. The design of this project focuses on harnessing computer vision and augmented reality to enable people with disabilities to draw in real time.



3- Computer Vision in Mobile Technology

Computer vision and machine learning models have gotten better over the last five years. One of the reasons is the rise of mobile technology and various applications, including social media platforms. In the last few months of 2019, platforms like Instagram have opened up to artists making "new filters". This included the release of Facebook's proprietary software, Spark AR. I downloaded Spark to understand how it works. Spark's facial tracking is very good. I tried out the features; this is interesting as it opens up this field to artists with no previous coding knowledge.

Trying out Spark AR face tracker



Spark AR tutorial


There are many artists using this platform, including artists like Zach Lieberman. I found this a potential area for further research. To understand it further I conducted an experiment where I would try out a different filter every day and post it on my social media. As somebody who is very averse to the idea of selfies, I felt extremely comfortable behind these "masks".

This made me think of research questions and thoughts that I would like to carry forward for this class.

a) With body and face tracking models becoming extremely sophisticated and consumable via social media, what does it mean for the future of the human body?

b) The idea of digital avatars and cyborg identities being controlled by the human body.

c) The body as art. The rise of facial recognition artists building AR filters which are now displayed on the body via mobile phone cameras. Are bodies going to become repositories for art?

d) With large corporations like Facebook opening this up to large masses of people, what are the ethics of the body being involved? Who owns the data of the body? How is this data being further used by large corporations to train machine vision?

4-Ethics and Design

This field of science and technology brings about ethical questions around the idea of the body as a controller.

One of the goals of the first week was also to seek out those conducting research and making works within the umbrella of computer vision that don't necessarily fit the status quo, and hence are not part of the "mainstream" discussions. In my further research I specifically wanted to see if there was a feminist discussion around this field. At first I was getting the same resources as we have listed above. It required a bit of digging within multiple resources to find the kind of work that was interesting.

Our goal for further research was to steer clear of a status-quo-driven, American-dominated field. Furthermore, my personal research interest during this independent study was to find work which applies a feminist lens to topics like computer vision.

I came across DeepLab.

Deep Lab is a congress of cyberfeminist researchers, organized by STUDIO Fellow Addie Wagenknecht to examine how the themes of privacy, security, surveillance, anonymity, and large scale data aggregation are problematized in the arts, culture and society.

Facial recognition software is everywhere, and users readily give out information. In terms of surveillance, and the body and face as data, as a designer entering this topic it is very pertinent to be aware and mindful of how this will impact all those who use these features.